Preface


This book is meant to be a practical introduction to data analysis in high- 
energy physics experiments, especially collider experiments using high-energy 
accelerators. 

The field requires a very wide range of knowledge, covering not only theoretical
particle physics but also detector technology and computing science. We often find
beginners in this field struggling to understand how data analysis is carried out,
simply because there are too many things one needs to know. Reading journal papers
often does not help, since they presuppose comprehensive understanding and training
that are not described in the papers themselves. It is quite difficult to acquire
such skills unless you have carried out a data analysis yourself at least once. We
hope that this book helps reduce such difficulties by providing a one-stop
“explanation” of the key aspects of data analysis.

This book should also serve as an introductory textbook for those who are
learning about individual subjects in data analysis, such as

• Basic ideas behind the methods to reconstruct and identify particles;
• Detector calibration in collider experiments;
• Statistical methods used in collider experiments;
• Methods to increase the sensitivity of an experiment through data analysis
  techniques;
• Simulation of particle collisions and detector responses.


This book is intended for undergraduate and first-year graduate students who
have taken basic-level courses in particle physics. Throughout the book, we have tried
to explain what happens without relying explicitly on many equations. We cover neither
the formalism of the interaction of particles with matter nor the theory of the
Standard Model of particle physics. Instead, we, as experimental physicists, provide a
“practical” explanation of how measurements are understood once those formalisms and
theories are taken as given.
Introduction


Experimental techniques in high-energy particle physics have been developing very
rapidly, even though the underlying experimental principle is relatively simple. Here,
we explain the principle first and then the developments built on top of it, since
experiments become more complex as the collision energy increases.

The purpose of high-energy experiments can be classified roughly into two
categories: to find new particles by converting the collision energy into particle
mass, and to investigate the nature of the interaction in order to see whether any
new feature exists there. In the first category, a new particle is produced through a
resonance or through radiation in the final state. The second category may involve
indirect detection of a new-particle contribution to the interaction, or the discovery
of sub-structure of the “elementary” particles, through precise measurements of, for
example, scattering angles.

Figure 1.1 shows two patterns (Feynman diagrams) of $e^+e^-$ collisions
corresponding to the two categories of experiments. The most typical interaction in
first-category experiments, the production of a new particle, is the resonance of a
state X in $e^+e^-$ annihilation, as depicted in Fig. 1.1a. In most or all collisions,
the state X corresponds to known particle(s), for example $Z^0/\gamma$, but it may
contain a contribution from a new particle. If there is such a contribution, the
invariant mass of the decay products, reconstructed from the energy and momentum of
all the decay particles from X, may show a resonance peak at the mass of the unknown
particle. As the simplest example, if we assume that X decays into two particles, we
can observe a peak in the invariant mass spectrum of the two particles.
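
As a minimal sketch of the two-particle case, the invariant mass can be computed
from $m^2 = (E_1 + E_2)^2 - |\vec{p}_1 + \vec{p}_2|^2$ in natural units. The function
name, the tuple layout and the example four-momenta below are illustrative
assumptions, not taken from any experiment's software.

```python
import math


def invariant_mass(p1, p2):
    """Invariant mass of a two-particle system.

    Each particle is a tuple (E, px, py, pz) in natural units (c = 1),
    for example in GeV. The layout is an illustrative assumption.
    """
    energy = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m_squared = energy * energy - (px * px + py * py + pz * pz)
    # Rounding can push m_squared slightly below zero; clamp before the root.
    return math.sqrt(max(m_squared, 0.0))


# Two massless particles emitted back to back, 45.6 GeV each (made-up numbers).
mass = invariant_mass((45.6, 0.0, 0.0, 45.6), (45.6, 0.0, 0.0, -45.6))
```

Filling a histogram with this quantity for many events is what would reveal a
resonance peak, if one exists.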

Fig. 1.1 Feynman diagrams of $e^+e^-$ interactions: (a) annihilation into a virtual
particle X, which decays into a pair of particles, and (b) exchange of a particle X′
between them

For the second category of experiments, namely those investigating interactions
and the sub-structure of particles, we instead use processes where an incoming
particle scatters off the other via the exchange of a state X′, as represented in
Fig. 1.1b. Depending on the nature of the exchanged X′, the scattering gives a definite
prediction for the angular and momentum distributions of the final-state particles.
For example, for Coulomb scattering, we know that the distribution of the scattering
angle of an electron beam off a heavy point-like target particle, such as a nucleus,
behaves like $1/\sin^4(\theta/2)$, where $\theta$ is the scattering angle with respect
to the incoming beam direction in the target rest frame (note that the spin effect on
the scattering angle is ignored here). The contribution from a new particle exchanged
as the state X′ would give a small modification to the fundamental Rutherford
scattering behaviour of $1/\sin^4(\theta/2)$. The effect should be enhanced in the
energy regime where the momentum transfer squared carried by the exchanged particle
exceeds the mass squared of the particle responsible for the new interaction, such as
an unknown heavy gauge boson.
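
To get a feeling for how steeply the Rutherford shape falls with angle, the short
sketch below evaluates $1/\sin^4(\theta/2)$ relative to its value at 90 degrees. Only
the angular shape quoted in the text is used; the overall normalisation is dropped
and the function names are illustrative assumptions.

```python
import math


def rutherford_shape(theta):
    """Angular shape 1/sin^4(theta/2); theta in radians, 0 < theta <= pi."""
    return 1.0 / math.sin(theta / 2.0) ** 4


def relative_rate(theta_deg, reference_deg=90.0):
    """Rate at theta_deg relative to the rate at reference_deg."""
    return (rutherford_shape(math.radians(theta_deg))
            / rutherford_shape(math.radians(reference_deg)))


for angle in (10.0, 30.0, 90.0, 180.0):
    print(f"theta = {angle:5.1f} deg  relative rate = {relative_rate(angle):.3g}")
```

A measured angular distribution would be compared with such a reference shape, and
a contribution from a new exchanged particle would appear as a departure from it.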

We conduct high-energy experiments based on these principles, and we need to
consider the detector, data-taking, reconstruction, identification, calibration,
analysis, and so on. The $2 \to 2$ processes illustrated in Fig. 1.1a, b can be fully
reconstructed if the momenta and energies of the two outgoing particles are measured;
the experiment for such a case is simple. In reality, the number of particles to be
measured is often more than two, and in fact far more for typical high-energy
scattering. There may be radiation of particles from the 2-to-2 process, as well as
decay products of heavy elementary particles, such as W and Z bosons and top quarks,
which decay further into many particles.

The situation is particularly difficult if the outgoing particles include neutrinos
or any other unknown neutral particles that interact only weakly with the detector
material. Such neutral particles practically always escape the detector. The only way
to “detect” them is to measure the so-called “missing momentum” using four-momentum
conservation, which corresponds to the momentum carried away by the neutral
particles. For that, we need to measure all the other particles in the final state.
The detector, therefore, needs to cover the interaction point almost fully, i.e. the
solid angle of the coverage should be close to $4\pi$. Such a detector is called a
hermetic detector. Neutrinos become common objects once the collision energy is high
enough to easily produce W and Z bosons, since these produce neutrinos through the
decays $W \to \ell\nu$ or $Z \to \nu\bar{\nu}$, where $\ell$ is one of $e$, $\mu$,
$\tau$ and $\nu = \nu_e$, $\nu_\mu$ or $\nu_\tau$. This is why all modern high-energy
collider experiments cover almost the full solid angle to form a hermetic system.
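
The idea of missing momentum can be sketched as follows for a symmetric lepton
collider, where the initial four-momentum is known. The data layout, the function
name and all numbers are illustrative assumptions.

```python
def missing_four_momentum(initial, visible):
    """Return the initial four-momentum minus the sum of the visible ones.

    Four-momenta are (E, px, py, pz) tuples in GeV (illustrative layout).
    """
    total = [0.0, 0.0, 0.0, 0.0]
    for particle in visible:
        for i in range(4):
            total[i] += particle[i]
    return tuple(initial[i] - total[i] for i in range(4))


# Symmetric collision at a centre-of-mass energy of 220 GeV:
# the initial state is at rest in the laboratory frame.
initial_state = (220.0, 0.0, 0.0, 0.0)

# Hypothetical measured particles; anything not listed escaped detection.
visible_particles = [
    (60.0, 36.0, 48.0, 0.0),
    (110.0, -66.0, -88.0, 0.0),
]

missing = missing_four_momentum(initial_state, visible_particles)
```

The result is only as good as the coverage of the detector: any visible particle
that is lost through a gap is wrongly attributed to the missing four-momentum, which
is why hermeticity matters.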

The density of particles in the detector is also an issue at high energies. As the
energy of the collision becomes higher, the number of particles increases
approximately in proportion to $\ln\sqrt{s}$, where $s$ is the square of the
centre-of-mass energy of the collision. This makes the angular density higher; in
particular, many particles are produced in a small angular area when an energetic
quark or gluon is produced and fragments. These collimated bunches of particles are
called jets. The presence of jets also requires the detector to have small
segmentation (fine “granularity”, as we often call it), such that a pair of particles
produced close to each other can be distinguished as two particles. In addition, the
increase of the collision energy also calls for more material to stop neutral
particles (not neutrinos, but neutrons, for example) in order to measure them. The
detector becomes thicker with energy, again $\propto \ln\sqrt{s}$, and the overall
size of the detector becomes larger.

Furthermore, modern high-energy experiments must deal with many different types of
stable particles. The detector has to measure electrons, muons and photons very
precisely. Hadrons (pions and kaons, practically) are also copiously produced, as
explained above. Measuring the energy of charged hadrons or baryons with high
precision is particularly difficult. Identification of the hadron species, such as
pions, kaons and other particles, may be desirable if one needs to study the decay
chains of particular mesons.

In addition, the presence of a b-quark is a very useful signal for investigating
physics involving quark flavour, in particular for top quarks, since a top quark
decays to a W boson and a b-quark with a branching ratio of practically 100%. As for
leptons, the production of high-momentum $\tau$ leptons could indicate new physics
with a special flavour structure, such as bosons that couple preferentially to
third-generation fermions. Identification of these particles relies on the fact that
they fly a short distance before decaying, which calls for a very precise tracking
device.

Thus, modern high-energy collider detectors should be able to deal with all sorts
of stable leptons and hadrons. In general, the design of a detector should be
optimised for the target physics. For collider detectors, however, versatility is
valued more, since there are many different targets; detectors need to measure known
Standard-Model (SM) processes precisely while also coping with peculiar signals from
new particles. Therefore, “general-purpose” detectors are preferred, so that
measurements of both known and unknown processes can be performed with good
precision.

Yet another point to consider at high energy is that the probability of observing
particular processes, i.e. the cross section (see Sect. 2.2), generally decreases with
the collision energy. A simple dimensional analysis tells us that the cross section
for point-like particles, such as in $e^+e^-$ or parton-parton collisions, decreases
like $1/s$, or like $1/E^2$ in terms of the incoming particle energy $E$ for symmetric
collisions. This means that the probability of the interactions you would like to find
is suppressed by $1/E^2$. This should be compensated by increasing the luminosity (see
Sect. 2.2), to which the number of collisions of a given process per unit time is
proportional. A higher luminosity is realised by a shorter time interval between
collisions, a higher beam current and better focusing of the beams in colliders.
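
As a rough numerical illustration of this scaling, assume a point-like cross section
that falls exactly as $1/s$ (an idealisation) and ask how much more luminosity is
needed to keep the event rate constant when the beam energy is raised. The function
name is an illustrative assumption.

```python
def luminosity_factor(energy_ratio):
    """Luminosity increase needed to keep the event rate fixed.

    Assumes sigma ~ 1/s ~ 1/E^2 for symmetric collisions (an idealisation),
    so the rate sigma * luminosity stays constant if luminosity grows as E^2.
    """
    if energy_ratio <= 0:
        raise ValueError("energy_ratio must be positive")
    return energy_ratio ** 2


# Raising the beam energy tenfold calls for a hundredfold luminosity.
factor = luminosity_factor(10.0)
```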

Now the problem is that this increases not only the signal but also the background
rates. In particular, for hadron-hadron collisions, the cross section is dominated by
soft processes (see Sect. 2.5), whose cross section is approximately constant in
collision energy instead of falling like $1/E^2$. The soft processes are the
background for most physics analyses. The increase in the number of collisions per
unit time also leads to a higher probability of pile-up of multiple collisions. The
issue of pile-up should be resolved by a detector system with high resolution, both in
space and time. Moreover, higher rates impose an additional challenge for data-taking.

Last but not least, for precision measurements it is also important that the
experiments are well modelled by simulation. So-called data-driven methods are adopted
to reduce the uncertainties in the estimation of background contributions. However, in
most cases, we still need help from the simulation, which introduces additional
uncertainties. In data analysis, cutting-edge techniques based on modern statistical
methods, including machine learning, should help to increase the experimental
sensitivity.

In summary, modern high-energy experimental physics, specifically at collider
experiments, should pay attention to the following items, even though the basic idea
of the experiments in terms of physics goals remains similar to that of lower-energy
experiments. Good reconstruction of all sorts of stable particles, including
neutrinos, should be possible with well-calibrated detectors. The detector should also
work under the harsh conditions of high-rate collisions. Modern statistical and
analysis methods and well-modelled simulation of physical processes and detectors
should be pursued with support from the rapid advancement of computing power. Last but
not least, the sensitivity of the experiments ultimately relies on ideas for data
analysis based on human understanding of the physics processes concerned. We believe
this will remain true even if smarter artificial intelligence is born.

The following chapters of this book provide comprehensive explanations of each
of the key elements in high-energy experiments and data analysis, with the aim of
helping the reader understand the above-mentioned subjects. Chapter 2 starts with an
overview of how a collider detector is designed to measure particles and how the data
are recorded. Also explained is how we extract the physical quantities of interest
from the data analyses. Chapter 3 covers the collider facility, the detector in
general and the data-taking system. Chapter 4 gives a basic overview of the statistics
used in high-energy physics. Chapter 5 describes the detector calibration procedure,
followed by particle identification in Chap. 6. Chapter 7 is devoted to explaining how
the simulation of the physical processes of collisions is carried out. All these
chapters are followed by the “exercise” part in Chap. 8, where we explain physics data
analyses and results from journal papers as examples of how the reconstructed events
are utilised to extract physical properties. These are measurements of Higgs production
cross sections through its decay to two photons, a $b\bar{b}$ pair, or a $W^+W^-$ pair.
Searches for new particles are also explained.
Basic Idea of Measurements in Particle Collisions


2.1 Observables of Particle Scattering 


Some of the readers of this book may have seen a so-called “event display”, a
visualisation of a particle collision. An example from the ATLAS experiment is given
in Fig. 2.1. There, we see many curves emerging from a particular point, which
indicates the location where two particles collided with each other. These curves are
charged tracks, i.e. the traces of charged particles identified from detector
responses. Also seen are many colourful boxes, which look like histograms, with the
height of the histograms pointing in the radial direction. These indicate the amount of
energy from the particles produced in the collision, as measured in the particle
detectors.

We would like to extract the properties of the physics processes of the collision
from such measurements. Ultimately, we would like to know the properties of the
underlying interactions, i.e. how the particles are scattered and how the final-state
particles are produced, at the level of the few elementary particles involved. As you
know, however, we have no way to know exactly how the elementary particles are
interacting, since the interaction is a process obeying quantum mechanics. All we can
do experimentally is to measure the physical observables as precisely as possible in
order to obtain their distributions. Such observables include the species of the
particles produced in the collisions and the energy, momentum, and possibly the
quantum numbers of the particles. We then identify the underlying interaction by
comparing the measured distributions with the theoretical predictions for various
interactions.

Here, we explain how such a correspondence between the observables and the
underlying process is realised by taking again the example of the reaction
$e^+e^- \to e^+e^-$ (Fig. 2.2) in $e^+e^-$ colliders. In the experiment, we can
establish the presence of the final-state particles, $e^+$ and $e^-$, and measure
their energy and momentum.

Fig. 2.1 Event display for an event in which the invariant mass of the two electrons
found in the event is one of the highest among the events (up to 2016) taken by the
ATLAS experiment. Reprinted under the Creative Commons Attribution 4.0 International
license from [1] © 2017 CERN for the benefit of the ATLAS Collaboration. The black
lines (or curves, more precisely) represent the charged tracks of the two electron
candidates. The yellow curves correspond to other charged tracks. The red and purple
boxes indicate energies measured in calorimeter units

Fig. 2.2 Diagram for the reaction $e^+e^- \to e^+e^-$. (a) The detail of the
interaction is left unspecified. (b) An example of the lowest-order Feynman diagram

Let us suppose that you have no knowledge of quantum electrodynamics (QED). Then you
do not know how the electrons exchange force when they scatter, i.e. how the force is
mediated. We often draw interactions of particles schematically in diagrams. Here the
diagram is drawn with a blob at the vertex of four fermions, in this case electrons
and positrons (Fig. 2.2a). The interaction inside the blob is not “visible”: although
we can guess that some particles are exchanged to mediate the scattering force, they
cannot be observed directly. All we can observe are the “observables”, the particles in
the final state.
Now we would like to know what is happening in the blob through experimental
measurements of the scattering. If we observe just one event of the interaction
$e^+e^- \to e^+e^-$, there is not much more information than that such an interaction
exists. However, if we observe many events with an $e^+e^-$ pair in the final state, we
can also measure the distributions of the final-state particles. This gives much more
information on the interaction. The distribution can be an angular, energy or, more
generally, (four-)momentum distribution. Beyond the shape of the distribution, the
integrated number of events for a given intensity of collisions (the luminosity, see
the next section) tells us the “strength” of the interaction, i.e. how often such
interactions occur. For simple interactions like $e^+e^- \to e^+e^-$, the angular and
momentum distributions allow us to extract the shape of the potential between the two
scattering particles through an appropriate transformation. For more complicated
interactions, like collisions with many particles produced, inclusive or exclusive
distributions (see Sect. 2.3) of a particular type of particle can be compared to
theoretical predictions. In this way, the blob of the diagram is uncovered and the
Lagrangian of the interaction can be constructed, even though the interpretation is
often limited by both the experimental uncertainties of the measurement and the
uncertainties of the theoretical predictions.


2.2 Cross Section and Luminosity 


The strength of the scattering has the dimension of $L^2$ (area), where $L$
represents the dimension of length. This can be understood by taking the example of
scattering in classical mechanics with one of the objects standing still (a target).
Suppose that you try to hit a target of a certain size by throwing an object (a
projectile) moving in a certain direction (Fig. 2.3a). If you assume that the
projectile hits the target only when the two come into contact, and that the size of
the projectile is negligible, the probability of hitting the target is proportional to
the size of the target viewed from the direction from which the projectile arrives.
More precisely, it is proportional to the cross section $\sigma$ of the target in the
plane perpendicular to the projectile direction. For that reason, we also call the size
of the interaction area the cross section. Whether the projectile enters the
interaction area or not depends on the distance between the centre of the target and
the line along which the projectile flies. The distance between the target centre and
the trajectory the projectile has at infinite distance, measured in the plane
perpendicular to the projectile momentum, is called the impact parameter $b$ (see
Fig. 2.3b).

For point masses that interact at a distance, which is the case for elementary
particles, the projectile and target are scattered, even if only infinitesimally, when
their impact parameter is very large, provided the force is long-range, i.e. reaches
out to infinite distance. This implies that the cross section for such an interaction
is well defined as a function of a property of the scattering, such as the scattering
angle $\theta$ in Fig. 2.3. Differential cross sections can be defined using such a
variable: $d\sigma/d\theta$ in this case.
Fig. 2.3 (a) Target and projectile. (b) Scattering angle $\theta$ and impact parameter
$b$, where a projectile particle is injected towards a target particle


The differential cross section can be calculated from theory if the interaction is
known. If we observe any deviation of the measured distribution from the prediction,
this tells us that the particle is not point-like or that some new type of interaction
exists.

The number of observed scatterings for a given cross section can be calculated if
we know the “strength” of the projectile beam. To maximise the number of events, one
needs to deliver as many particles as possible into as small an area as possible. This
means that the particle flow density $(1/a)\,dN_p/dt$ is one of the parameters that
determine the number of interactions. Here, $a$ is the area into which the projectiles
are injected, i.e. the size of the projectile beam, and $N_p$ is the number of
projectiles passing through that area. Then the number of interactions $N$ per unit
time is given by $dN/dt = \sigma \cdot (1/a)\,dN_p/dt$. You see that the cross section
gives the correct dimension to deduce the number of interactions.

To obtain the number of collisions, we also need to express the number of particles
in the target object via the density of the target. For colliding beams, the
corresponding quantity would be the density of the “target beam”. For a fixed target,
one should count the number of target particles within the projectile beam size $a$.
Suppose that the depth of the target is $D$ and the number density of targets per unit
volume is $n_t$. The total number of targets $N_t$ that could scatter the projectile
beam is $N_t = D a n_t$. It is useful to express the target density in terms of the
mass density $\rho$ of the material when we discuss the interaction of a particle with
matter, such as the material that makes up detectors. The number of targets is then
expressed as

$$N_t = D a \frac{\rho N_A}{A},$$

where the targets are nuclei, $\rho$ is given in g/cm$^3$, $A$ is the mass number
expressed in g/mol, and $N_A$ is the Avogadro number. The number of interactions for a
target at rest (the case of a fixed-target experiment), integrated over time, reduces
to

$$N = \frac{N_p N_t \sigma}{a} = \frac{N_p D \sigma \rho N_A}{A},$$

assuming the target is larger than the beam size.
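
The fixed-target formula can be evaluated numerically as in the sketch below. The
function name and the input numbers (a roughly carbon-like target and a 1 mb cross
section) are illustrative assumptions, and the target is assumed thin enough that the
interaction probability per projectile is small.

```python
AVOGADRO = 6.02214076e23  # particles per mol


def fixed_target_interactions(n_projectiles, depth_cm, density_g_cm3,
                              molar_mass_g_mol, sigma_cm2):
    """N = N_p * D * rho * N_A * sigma / A for a target wider than the beam."""
    # Number of target nuclei per unit area seen by the beam (per cm^2).
    targets_per_cm2 = depth_cm * density_g_cm3 * AVOGADRO / molar_mass_g_mol
    return n_projectiles * targets_per_cm2 * sigma_cm2


# Hypothetical numbers: 1e10 projectiles on a 1 cm thick target with
# rho = 2.0 g/cm^3 and A = 12 g/mol, and a cross section of 1 mb = 1e-27 cm^2.
n_int = fixed_target_interactions(1e10, 1.0, 2.0, 12.0, 1e-27)
```

With these inputs the estimate is about $10^6$ interactions. Note that the beam area
$a$ has dropped out, as in the formula above.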
The number of collisions for a collider can be obtained by replacing the number of
targets with the number of beam particles. With a beam as the target, it is easier to
use the number of target beam particles per unit time, $dN_t/dt$, since the target is
also no longer at rest. The beam particles are bunched in RF buckets (see Sect. 3.2)
in almost all colliders. The number of particles per unit time crossing a given plane
perpendicular to the beam, $dN_{\mathrm{beam}}/dt$, is given by
$f_{\mathrm{coll}} \cdot n_{\mathrm{bunch}}$, where $f_{\mathrm{coll}}$ is the number of
bunch collisions per unit time and $n_{\mathrm{bunch}}$ is the number of particles in
a bunch. In an ideal situation where both the lateral bunch size $a$ (dimension $L^2$)
and the longitudinal distribution of particles inside the bunch are the same for the
target beam and the projectile beam, the frequency of collisions is given by

$$\frac{dN}{dt} = f_{\mathrm{coll}} \frac{n_1 n_2}{a} \sigma,$$

where $n_1$ and $n_2$ are the numbers of particles in a bunch for beam 1 and beam 2
(we no longer distinguish the target and projectile beams at this point). This
suggests that it is convenient to define the instantaneous luminosity $\mathcal{L}$,

$$\mathcal{L} = f_{\mathrm{coll}} \frac{n_1 n_2}{a}.$$

Note that the dimension of $\mathcal{L}$ is $L^{-2}T^{-1}$. Then the number of
collisions $N_{\mathrm{coll}}$ for a given period with the cross section $\sigma$ is
obtained as

$$N_{\mathrm{coll}} = \sigma \times \int \mathcal{L}\,dt,$$

where $\int \mathcal{L}\,dt$ is called the integrated luminosity.
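
A small numerical sketch of these two relations is given below. It assumes the
idealised situation described above, with $a$ treated as an effective bunch area and
the luminosity held constant over the running period. The function names and machine
parameters are hypothetical.

```python
def instantaneous_luminosity(f_coll_hz, n1, n2, area_cm2):
    """L = f_coll * n1 * n2 / a in cm^-2 s^-1 (idealised identical bunches)."""
    return f_coll_hz * n1 * n2 / area_cm2


def expected_events(sigma_cm2, luminosity, seconds):
    """N_coll = sigma * integral(L dt), here for a constant luminosity."""
    return sigma_cm2 * luminosity * seconds


# Hypothetical machine: 1e7 bunch collisions per second, 1e11 particles per
# bunch in each beam, effective bunch area 1e-5 cm^2.
lumi = instantaneous_luminosity(1e7, 1e11, 1e11, 1e-5)

# A process with a cross section of 1 pb (1e-36 cm^2), one day of running.
n_events = expected_events(1e-36, lumi, 86400.0)
```

These inputs give a luminosity of about $10^{34}$ cm$^{-2}$ s$^{-1}$ and roughly 864
events per day for a 1 pb process.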


2.3 Identifying Processes Through Measurements of Final 
State Particles 


In this section, we discuss how to establish whether the observed events are indeed
the ones of interest. You may think that identifying events is as simple as counting
the number of events with a given final state, provided we can safely assume that the
detector works well enough to identify the particles. There are, however, a few points
still to consider when we come to the “definition of the signal events.”

Let us start with the example of $e^+e^- \to e^+e^-$. To see whether an event
belongs to this category or not, we first need to identify the type of the final-state
particles through measurements by the detectors. We call this procedure particle
identification or particle ID. We discuss the technical details of particle ID in
Chap. 6. Here, we simply assume that the final-state particles are identified with
certain probabilities. You would then count the number of particles of interest. In
this example, you would require that the events have one electron and one positron in
the final state.

A few questions arise: we may wonder whether we should impose certain criteria on
the energy or momentum of the electron and positron. We also need to decide whether we
allow any other particle(s) in the final state. For example, since what we would like
to measure is $e^+e^- \to e^+e^-$, you may wish to select only events that have one
$e^+e^-$ pair and no additional observed particles. In order to make such a selection
well defined, we need to consider the following things.

Firstly, the electrons can emit soft photons induced by the electromagnetic field of
the interacting electron and positron themselves. The probability of this is very high
for electrons, whose mass is very small. Since such soft photons are in any case not
observed efficiently by the detector, experimentally you can define the event as
containing only an $e^+e^-$ pair in the final state by requiring the other detector
responses, those not associated with either the electron or the positron, to be below
certain thresholds (typically slightly above the noise level of the detectors). A
photon emitted collinearly with the electron would still be difficult to separate from
the parent electron. In that case, such collinear photons are often treated as a part
of the parent electron, and the measured energy and momentum are then considered to be
those of the primary electron.

Secondly, no particle detector at an accelerator has $4\pi$ coverage in solid
angle: the detector is not completely hermetic. This means that we may miss some of
the final-state particles, even if they are energetic enough to be detected, since the
detector may have holes in their direction. For example, the probability for an
$e^+e^-$ collision to emit a photon in the direction of the incoming electron or
positron is very high, again because of the small mass of the electron. Such events are
called initial state radiation (ISR) events. It is not possible to catch the photon if
it is emitted at a very small angle to the incoming beam direction, since the
accelerator has to accommodate the beam in a beam pipe with a finite diameter of
typically more than a few centimetres. The photon escapes from the detector through
the beam pipe.

Therefore, all we can do to select an event is to impose criteria that select what
looks like an $e^+e^-$ final state. For an $e^+e^-$ collider in which the two beams
have the same energy, so that the laboratory frame coincides with the centre-of-mass
system, an example of such criteria would be


• a pair of electron-like particles of opposite charge, both within the angular
  region $\theta_{\min} + \Delta < \theta < \pi - (\theta_{\min} + \Delta)$. Here,
  $\theta$ is the polar angle, with the $z$ direction defined as the incoming
  direction of one of the beams, $\theta_{\min}$ is the polar angle of the boundary of
  the hole in the detector that accommodates the beam pipe, and $\Delta$ is a margin
  taken so that the observed particles are far enough from the boundary;

• no other track or cluster observed in other parts of the detector with momentum or
  energy greater than $0.05\,E_{\mathrm{beam}}$, where $E_{\mathrm{beam}}$ is the beam
  energy; and

• each electron fulfils $E_{\mathrm{elec}} > 0.8\,E_{\mathrm{beam}}$, where
  $E_{\mathrm{elec}}$ is the energy of the electron or the positron.


The first criterion makes sure that the events are well contained within the angular
coverage of the detector, $\theta_{\min} < \theta < \pi - \theta_{\min}$. The second
criterion removes events in which extra particles with significant energy, on top of
the signal electron and positron, exist within the acceptance of the detector. The
threshold $0.05\,E_{\mathrm{beam}}$ may vary depending on the experimental conditions.
The last criterion reduces the probability that the $e^+e^-$ pair is produced in
association with some other particles that escape detection. This requirement
effectively removes events from so-called two-photon processes (see Fig. 2.4), where
an $e^+e^-$ pair appears in the final state and both the electron and the positron
lose significant energy.

Fig. 2.4 The two-photon process in $e^+e^-$ collisions. The blob indicates an
interaction of the incoming $\gamma\gamma$, emitted from the $e^+$ and $e^-$, producing
outgoing final-state particles (a multi-particle state). The final state often
consists of many hadrons since the photon-photon collision couples either to a quark
or to a neutral vector meson, which contains a $q\bar{q}$ pair

You may wish to add further criteria to constrain the process, increasing the
fraction of the process you have in mind in the event sample, so that the
interpretation of the cross section thus defined becomes more intuitive. This kind of
event selection is called exclusive event selection. The example given above would be
called a measurement of exclusive $e^+e^-$ production.
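
The three exclusive criteria listed above can be sketched in code as follows. The
`Candidate` class, the function name and the example event are hypothetical; particle
identification is assumed to have been done already, and the requirement of exactly
two electron-like candidates in the fiducial region is one possible reading of the
first criterion.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    energy: float        # GeV
    theta: float         # polar angle in radians
    charge: int          # +1, -1, or 0 for neutral clusters
    electron_like: bool  # outcome of particle ID (assumed given)


def passes_exclusive_selection(candidates, e_beam, theta_min, margin):
    """Apply the three exclusive e+e- criteria to one event."""
    lo = theta_min + margin
    hi = math.pi - (theta_min + margin)
    electrons = [c for c in candidates
                 if c.electron_like and c.charge != 0 and lo < c.theta < hi]
    # Criterion 1: one opposite-charge pair inside the fiducial angular region.
    if len(electrons) != 2 or electrons[0].charge + electrons[1].charge != 0:
        return False
    # Criterion 2: no other track or cluster above 5% of the beam energy.
    others = [c for c in candidates if all(c is not e for e in electrons)]
    if any(c.energy > 0.05 * e_beam for c in others):
        return False
    # Criterion 3: both electrons carry more than 80% of the beam energy.
    return all(e.energy > 0.8 * e_beam for e in electrons)


# Made-up event: two energetic electrons and one soft neutral cluster.
event = [
    Candidate(44.0, 1.0, -1, True),
    Candidate(43.5, math.pi - 1.0, +1, True),
    Candidate(0.8, 2.0, 0, False),
]
selected = passes_exclusive_selection(event, e_beam=45.6, theta_min=0.2, margin=0.05)
```

Keeping the thresholds as explicit arguments or constants makes it easy to vary them,
which matters because values such as the 5% veto depend on the experimental
conditions.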

Another way to investigate the underlying physics of $e^+e^-$ collisions through the
$e^+e^-$ final state is to keep the selection criteria as simple as possible. An
extreme case would be to just select an $e^+e^-$ pair in which both particles are
energetic enough, such as


• a pair of electron-like particles of opposite charge, both within the angular region
  $\theta_{\min} < \theta < \pi - \theta_{\min}$;
• each electron fulfils $E_{\mathrm{elec}} > 0.3\,E_{\mathrm{beam}}$, where
  $E_{\mathrm{elec}}$ is the energy of the electron.


This allows you to select events which include other particles in the final state; 
for example, the selected events contain, with a high probability, the process from 
Fig. 2.4. This kind of event selection is called inclusive event selection. The benefit of 
the inclusive event selection is that it is indeed “well defined” in terms of the theoret- 
ical prediction. For measurements with exclusive event selections, the modelling of 
the soft emission of many particles in the theoretical prediction (Monte Carlo simu- 
lation) is often not easy since it involves higher orders of the perturbative calculation. 
On the other hand, the total production rate, which is one of the measurements with 
inclusive event selections, is often calculated well since certain techniques exist to 
sum up all the contributions of final states. 

Inclusive measurements are often performed when one does not know the number of
particles produced in the reaction, either because it is not measured or because it is
not easy to measure. Examples are jet production in hadron-hadron collisions, e.g.
$pp \to n \times \mathrm{jet} + X$, where $n \geq 1$ and $X$ represents a part of the
final state with any number of particles, or deep-inelastic scattering $eN \to eX$,
where $N$ is a nucleon.


2.4 Event Acceptance and Efficiency 


The event selection criteria reduce the number of events retained from the process
of interest. The number of remaining events becomes smaller as the event
selection becomes more exclusive, which is necessary if the contribution
from background processes is to be reduced. The solid angle coverage of your detector
also sets a hard limit on what can be detected. All these effects reduce
the acceptance, defined as


$$(\text{acceptance}) = \frac{N_{\mathrm{sel}}}{N},$$


where $N_{\mathrm{sel}}$ denotes the number of events passing the selection criteria, imposed on
the true four-momenta of particles, while $N$ is the number of events from the processes of
interest. Here, the true four-momentum means the one used in the theoretical calculation,
without detector smearing. It may be available in event generators (see Chap. 7).

The acceptance is often a very small number (e.g. $\ll O(10^{-1})$) if the event selection
is exclusive and the denominator is defined as all the events from the process.
The acceptance may instead become closer to unity if it is defined for differential
cross sections at a given point in the phase space, both for the denominator and the
numerator. As an example, for the exclusive event selection criteria given above for the
$e^+e^-$ final state, the cross section may be defined as the double-differential cross section
$d^2\sigma/dE_1\,d\theta_1$, where $E_1$ and $\theta_1$ are the energy and angle of the highest-energy
electron. The acceptance for events with either $\theta_1 < \theta_{\min} + \Delta$, $\theta_1 > \pi - (\theta_{\min} + \Delta)$, or
$E_1 < 0.8\,E_{\mathrm{beam}}$ is zero. That means that the differential cross sections corresponding
to these kinematic regions are also zero. For other regions, however, one would expect the
acceptance to be closer to unity than the average acceptance over all possible kinematic
regions of the processes of interest. In this way, we can avoid a large extrapolation
factor from $N_{\mathrm{sel}}$ to $N$.

Since no detector has 100% detection efficiency, we necessarily lose
events through the inefficiencies of the detectors. We often treat this loss
separately from the geometrical acceptance and call it efficiency,
defined as


$$(\text{efficiency}) = \frac{N_{\mathrm{det}\cap\mathrm{sel}}}{N_{\mathrm{sel}}},$$


where $N_{\mathrm{det}\cap\mathrm{sel}}$ denotes the number of events, among those counted in $N_{\mathrm{sel}}$, that also pass
the selection criteria imposed on quantities obtained from measurements, while $N_{\mathrm{sel}}$ is
the number of events passing the selection criteria imposed on the true four-momenta of particles.

Note that the detailed definitions of acceptance and efficiency may differ from
those given above, depending on the physics analysis or the literature.

With the acceptance $A$ and efficiency $\epsilon$ defined, the number of signal events $N_{\mathrm{sig}}$
for collisions with integrated luminosity $L_{\mathrm{int}}$ and cross section $\sigma$ can be derived from


$$N_{\mathrm{sig}} = L_{\mathrm{int}}\,\sigma\,A\,\epsilon.$$


The cross section can be obtained from this equation. 

In most measurements, you cannot ignore the presence of background events.
$N_{\mathrm{sig}}$ should then be replaced with $N_{\mathrm{sig}} = N_{\mathrm{obs}} - N_{\mathrm{bkgd}}$, where $N_{\mathrm{obs}}$ is the observed
number of events and $N_{\mathrm{bkgd}}$ is the number of background events. The estimation of
background events is a time-consuming part of data analysis and is discussed later.
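The chain from event counts to a cross section can be sketched in a few lines. This is a minimal illustration, assuming that a simulated sample provides, for each generated event, one flag for the selection on true four-momenta and one for the selection on measured quantities. The function names and all numbers are hypothetical.

```python
from typing import Sequence


def acceptance(truth_pass: Sequence[bool]) -> float:
    """A = N_sel / N, with the selection applied to true four-momenta."""
    return sum(truth_pass) / len(truth_pass)


def efficiency(truth_pass: Sequence[bool], reco_pass: Sequence[bool]) -> float:
    """epsilon = N_(det and sel) / N_sel."""
    n_sel = sum(truth_pass)
    n_det_and_sel = sum(t and r for t, r in zip(truth_pass, reco_pass))
    return n_det_and_sel / n_sel


def cross_section(n_obs: float, n_bkgd: float, lumi_int: float,
                  acc: float, eff: float) -> float:
    """sigma = (N_obs - N_bkgd) / (L_int * A * epsilon).

    The result is in the inverse units of lumi_int (pb if lumi_int is in pb^-1).
    """
    return (n_obs - n_bkgd) / (lumi_int * acc * eff)


# Toy simulated sample of ten events (illustrative flags only).
truth_pass = [True, True, True, True, False, False, False, False, False, False]
reco_pass = [True, True, True, False, False, False, False, False, False, False]

acc = acceptance(truth_pass)             # 4 of 10 events pass at truth level
eff = efficiency(truth_pass, reco_pass)  # 3 of those 4 also pass after detection
sigma = cross_section(n_obs=1300, n_bkgd=100, lumi_int=100.0, acc=acc, eff=eff)
print(f"A = {acc:.2f}, epsilon = {eff:.2f}, sigma = {sigma:.1f} pb")
```

In a real analysis each of these inputs carries an uncertainty that must be propagated to the cross section.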

Note that it is often difficult to know $L_{\mathrm{int}}$ precisely enough in collider
experiments, especially at hadron colliders. In addition, a deviation of the detector calibration
from its true value shifts the number of observed events, often rather
uniformly across other kinematic variables (e.g. angles). The normalisation of a cross
section, or of differential cross sections, is important for extracting the strength of the
interaction, such as a coupling constant, and higher-order effects in the
perturbative calculation. The normalisation is, however, not necessary when extracting
physics quantities from the shape of a distribution, such as the mass spectrum
or the spin of the exchanged particles. A normalised distribution is used to extract
physics quantities for such purposes.


2.5 Nature of Hadron-Hadron Collisions and Kinematic 
Variables 


Hadron-hadron collisions are realised by bringing two high-energy beams of stable hadrons
into collision. In practice, only protons or anti-protons have been used in modern
high-energy accelerators. The proton is not an elementary particle; instead, it
consists of partons, i.e. quarks and gluons. The processes that take place in
hadron-hadron collisions fall into two categories: soft and hard interactions.

In soft collisions (Fig. 2.5a), the constituents of a proton are not resolved during the
interaction of, for example, two protons. This occurs when the particle exchanged
in the interaction does not carry high momentum. Such an interaction cannot
be described perturbatively by quantum chromodynamics (QCD), since the strong
coupling constant $\alpha_s$ is too large for low-energy interactions. Figure 2.6
shows the behaviour of $\alpha_s(\mu^2)$, where $\mu$ represents the energy scale of the interaction,
e.g. the four-momentum of the exchanged particle. Since $\alpha_s$ becomes much larger
than $O(10^{-1})$ when the momentum transfer is similar to or smaller than
$\Lambda_{\mathrm{QCD}} \approx 200$ MeV, a perturbative expansion based on the number of partons is no longer
possible there. In such a situation, the partons are bound strongly and the nucleons
move collectively. Individual partons are no longer visible. We often call such an
object “quark matter”, like a fluid consisting of quarks and gluons, which binds the
quarks together.

Fig. 2.5 (a) A schematic view of a soft pp collision with multi-hadronic final state. (b) An example
of a hard pp collision with two high-$p_T$ partons in the final state

Fig. 2.6 The strong coupling constant $\alpha_s$ as a function of the renormalisation scale $\mu = Q$
(horizontal axis: $Q$ [GeV]). Reprinted under the Creative Commons Attribution 4.0 International
license from [2] © 2018 CERN, for the ATLAS Collaboration. The data are taken from measurements
from hadron and ep colliders. The curve indicates the solution of the renormalisation-group
equation using $\alpha_s$ obtained from the results indicated by red (solid circle) points
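The scale dependence of $\alpha_s$ described here can be illustrated numerically. The sketch below uses the standard one-loop solution of the renormalisation-group equation, which is not written out in the text. The reference value $\alpha_s(m_Z) = 0.118$ and the fixed number of flavours are assumptions made for illustration, and flavour thresholds are ignored.

```python
import math

M_Z = 91.1876       # GeV, Z boson mass
ALPHA_S_MZ = 0.118  # assumed reference value of alpha_s at the Z mass


def alpha_s_one_loop(q: float, n_f: int = 5) -> float:
    """One-loop running coupling evolved from m_Z to the scale q (in GeV).

    n_f is kept fixed, so the result is only indicative at low scales.
    """
    b0 = (33 - 2 * n_f) / (12 * math.pi)
    return ALPHA_S_MZ / (1 + b0 * ALPHA_S_MZ * math.log(q**2 / M_Z**2))


for q in (2.0, 10.0, 100.0, 1000.0):
    print(f"Q = {q:7.1f} GeV   alpha_s = {alpha_s_one_loop(q):.3f}")
```

The coupling grows as the scale decreases, and the one-loop expression eventually diverges at a scale of a few hundred MeV, which is where the perturbative description stops being meaningful.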

The soft interaction would therefore look like two pancake-like composite
objects passing through each other. The collision of such objects may be simplified
as follows: the two objects interact with each other if they overlap,
and do not interact if the impact parameter is
larger than the diameter of the objects. The cross section for such
an interaction is expected to be constant as a function of the centre-of-mass (CM) energy of the
collision, assuming that the size of the proton is approximately independent of the
collision energy. Measurements at various CM energies show that the cross section rises
with energy, but very slowly.

On the other hand, a parton in one proton (denoted parton A) can resolve
a parton in the other proton (parton B) when the momentum transfer of the
exchanged particle is much larger than $\Lambda_{\mathrm{QCD}}$ (see Fig. 2.5b). If the impact parameter
of parton A with respect to parton B is very small, the field produced by
parton B is strong, so that the interaction may occur in a very high energy regime.
In other words, the particle exchanged in such an interaction may carry very high
momentum, and hence have a short wavelength. This allows the partons inside hadrons to be
resolved. $\alpha_s$ then becomes small enough that the interaction can be described by a
perturbative calculation. In such hard interactions, the lowest order in perturbation theory,
i.e. the 2-to-2 interaction, becomes dominant; two high-momentum (i.e. hard) partons
are produced in the final state.

In order to describe the kinematic features of hadronic collisions, let us define the
coordinate system commonly used for hadron-hadron collisions. We choose a system where
the $z$-axis is along the beam line, with the positive direction of the $z$-axis being the
direction of travel of beam A (such as the proton beam for $p\bar{p}$ colliders, or the
proton beam rotating counter-clockwise for $pp$ colliders). The $x$-axis is then chosen to point
towards the centre of the accelerator ring. The $y$-axis is defined so as to form a
right-handed coordinate system. The transverse plane is the $xy$ plane perpendicular to the
beam direction. The transverse momentum is defined as


$$p_T = \sqrt{p_x^2 + p_y^2} = p\sin\theta,$$


where $\theta$ is the polar angle in the coordinate system given above. We often use $(p_T, \phi)$
instead of $p_x$ and $p_y$, where $\phi$ is the azimuthal angle.
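As a quick numerical check of these definitions, the following sketch converts Cartesian momentum components into $(p_T, \phi, \theta)$ and verifies that $p_T = p \sin\theta$. The function names and the momentum values are illustrative assumptions.

```python
import math


def transverse_momentum(px: float, py: float) -> float:
    """p_T = sqrt(px^2 + py^2)."""
    return math.hypot(px, py)


def azimuthal_angle(px: float, py: float) -> float:
    """phi, measured in the transverse (xy) plane, in (-pi, pi]."""
    return math.atan2(py, px)


def polar_angle(px: float, py: float, pz: float) -> float:
    """theta, measured from the +z (beam) axis, in [0, pi]."""
    return math.atan2(math.hypot(px, py), pz)


# Illustrative momentum in GeV.
px, py, pz = 3.0, 4.0, 12.0
p = math.sqrt(px**2 + py**2 + pz**2)
pt = transverse_momentum(px, py)
theta = polar_angle(px, py, pz)

print(f"p_T = {pt:.3f} GeV, p*sin(theta) = {p * math.sin(theta):.3f} GeV")
print(f"phi = {azimuthal_angle(px, py):.3f} rad")
```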

A soft-interaction event is characterised by the absence of particles with high $p_T$,
while at least a few high-$p_T$ particles are produced in hard collisions. The reason
for using $p_T$ comes from the nature of hadron-hadron collisions. The hard collision
occurs between two partons, whose energies are not equal even if the hadron beam
energies are the same. The momentum of parton A (B) can be expressed as
$p_{A(B)} = x_{A(B)}\,p_{\mathrm{beam}}$ using the momentum fraction $x_A$ ($x_B$), which is defined as the ratio
of the momentum of the parton involved in the hard collision to the momentum of the hadron
to which the parton belongs. In general $x_A \neq x_B$, meaning that the centre-of-mass
frame of the two partons involved in the hard collision is boosted along the beam axis with
respect to the centre-of-mass frame of the two beams, which corresponds to the laboratory frame for
symmetric colliders. Therefore, the only momentum component whose initial value is known,
and for which momentum conservation can be exploited, is the one in the transverse plane.
The longitudinal momentum of the centre-of-mass system of partons A and B cannot be
determined unless the values of $x_A$ and $x_B$ are obtained from other experimental quantities.

We may now wish to determine the third component, $p_z$, of the momentum
of the parton-parton collision system. However, it is not always possible to
measure this component precisely in hadron-hadron collisions, since a large
fraction of the longitudinal momentum is lost through particles escaping down the beam
pipe, and some of the lost particles may have emerged from the parton-parton
collision. The variable $E - p_z$ of an event, in contrast, can be measured well even if we lose
particles in the beam pipe in the $+z$ direction, since the contribution to $E - p_z$ from such
particles, which have very small scattering angles, is almost zero. Similarly,
$E + p_z$ is also well measured even if we lose particles collinear to the $-z$ direction.
Variables constructed from $E - p_z$ and $E + p_z$ of a system (an event or a part of an
event), or of a particle, can therefore be determined with reasonable accuracy. The most
common and convenient choice is the rapidity $y$, defined as


$$y = \frac{1}{2}\ln\frac{E + p_z}{E - p_z} = \ln\frac{E + p_z}{m_T}, \qquad m_T^2 = m^2 + p_T^2.$$


A good property of the rapidity is that the difference in rapidity between two
four-momenta is preserved under a Lorentz boost along the beam ($z$) axis. This also means
that such a boost can be applied simply by adding the rapidity of the boost.

In the limit where a particle is massless, the rapidity is equal to the pseudorapidity $\eta$,
defined as


$$\eta = -\ln\tan(\theta/2).$$


$\eta$ is a good approximation of $y$ in modern high-energy collider experiments, where
the particles from hard collisions are produced with momenta $> O(10)$ GeV, which is much
higher than the typical masses of long-lived final state particles such as the electron, muon,
pion, or kaon. At $\theta = \pi/2$, $|d\eta/d\theta| = 1$, i.e. a small interval $\Delta\eta$ has the same
size as the corresponding $\Delta\theta$.
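Both properties can be checked numerically. The sketch below is an illustration under simple assumptions: the particles are charged pions (mass 0.1396 GeV), the boost is along the $z$-axis, and the momenta and angles are arbitrary illustrative values.

```python
import math

M_PION = 0.1396  # GeV, charged pion mass


def rapidity(energy: float, pz: float) -> float:
    """y = 0.5 * ln((E + pz) / (E - pz))."""
    return 0.5 * math.log((energy + pz) / (energy - pz))


def pseudorapidity(theta: float) -> float:
    """eta = -ln(tan(theta / 2))."""
    return -math.log(math.tan(theta / 2.0))


def boost_along_z(energy: float, pz: float, beta: float):
    """Lorentz-boost (E, pz) along z with velocity beta (in units of c)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (energy + beta * pz), gamma * (pz + beta * energy)


def energy_pz(p: float, theta: float, mass: float):
    """Return (E, pz) for momentum magnitude p, polar angle theta and mass."""
    return math.sqrt(p * p + mass * mass), p * math.cos(theta)


# 1) Rapidity differences are unchanged by a boost along z,
#    and each rapidity is shifted by atanh(beta).
a = energy_pz(20.0, 0.3, M_PION)
b = energy_pz(15.0, 2.0, M_PION)
beta = 0.8
dy_before = rapidity(*a) - rapidity(*b)
dy_after = rapidity(*boost_along_z(*a, beta)) - rapidity(*boost_along_z(*b, beta))
shift = rapidity(*boost_along_z(*a, beta)) - rapidity(*a)
print(f"delta y: {dy_before:.6f} (before), {dy_after:.6f} (after)")
print(f"shift: {shift:.6f}, atanh(beta): {math.atanh(beta):.6f}")

# 2) eta approximates y well only when the momentum is much larger than the mass.
for p in (0.3, 20.0):
    y = rapidity(*energy_pz(p, 0.3, M_PION))
    print(f"p = {p:5.1f} GeV: y = {y:.4f}, eta = {pseudorapidity(0.3):.4f}")
```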

Since the rapidity is additive under Lorentz boosts along the beam axis, it is well suited to
expressing the Lorentz-invariant phase space of final state particles. The phase space for a
particle is


$$d^4p\,\delta(p^2 - m^2) \propto \frac{d^3\mathbf{p}}{E} = \pi\,dy\,dp_T^2,$$


where $m$ is the mass of the particle, $\mathbf{p}$ is its three-momentum, and
$p$ denotes its four-momentum; the last expression holds after integrating over the azimuthal
angle. This means that particles are produced uniformly in
$y$ if they are distributed uniformly in phase space. A differential cross section is
therefore often expressed as $d\sigma/dy$ instead of $d\sigma/d\theta$. The latter is more commonly
used in non-relativistic collisions, fixed target experiments, or $e^+e^-$ collisions.


2.6 Structure of Hadrons and Parton Density Function 


The hard process introduced in the previous section treats a hadron-hadron
collision as the scattering of one parton from parent hadron A, carrying momentum
fraction $x_A = p_A/p_{\mathrm{beam},A}$, with another parton from the other parent hadron B, carrying
$x_B = p_B/p_{\mathrm{beam},B}$. For high-energy collisions where the masses of the partons can be ignored,
the centre-of-mass energy of the two partons A and B, $\sqrt{\hat{s}}$, is given by $\sqrt{x_A x_B s}$,
where $\sqrt{s}$ is the centre-of-mass energy of the hadron-hadron collision.
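The relation $\sqrt{\hat{s}} = \sqrt{x_A x_B s}$ is easy to evaluate. The following sketch also computes the rapidity of the parton-parton system, $\frac{1}{2}\ln(x_A/x_B)$, which follows from the rapidity definition for massless partons at a symmetric collider but is not stated in the text. The beam energy and momentum fractions are illustrative assumptions.

```python
import math


def parton_cm_energy(x_a: float, x_b: float, sqrt_s: float) -> float:
    """sqrt(s_hat) = sqrt(x_A * x_B * s) for massless partons."""
    return math.sqrt(x_a * x_b) * sqrt_s


def parton_system_rapidity(x_a: float, x_b: float) -> float:
    """Rapidity of the parton-parton system in the laboratory frame.

    Assumes a symmetric collider and massless partons, so that
    E = (x_A + x_B) * p_beam and p_z = (x_A - x_B) * p_beam.
    """
    return 0.5 * math.log(x_a / x_b)


sqrt_s = 13000.0  # GeV, illustrative hadron-hadron centre-of-mass energy
for x_a, x_b in ((0.01, 0.01), (0.1, 0.001)):
    print(f"x_A = {x_a}, x_B = {x_b}: "
          f"sqrt(s_hat) = {parton_cm_energy(x_a, x_b, sqrt_s):.1f} GeV, "
          f"y = {parton_system_rapidity(x_a, x_b):.3f}")
```

Both pairs of momentum fractions give the same $\sqrt{\hat{s}}$, but only the asymmetric pair produces a system that is boosted along the beam axis.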

In order to estimate the cross section of hard collisions, one needs to know the 
“luminosity” of such partons, i.e. number of partons in the incoming beam particles, 
in order to convert the luminosity of hadron-hadron collisions to the luminosity of 
parton-parton collisions. The number density would depend on x of the parton since 
a hadron is a composite particle consisting of partons of various momenta. Now a 
slight complication is that the number of partons with a given x also depends on the 
wavelength of the probe. This is explained qualitatively as follows. 

One needs an electron microscope to see the structure of viruses, since a virus
is smaller than the wavelength of visible light. The energy of the electron beam in an
electron microscope ($\gg$ keV) is much larger than that of visible light, so that one
can resolve the fine structure of the virus. Similarly, if we wish to see the structure of
hadrons, it is necessary to use a probing beam whose wavelength is much shorter
than the size of the hadron itself, $\sim 1$ fm, which corresponds to an energy scale of
$\sim 200$ MeV. In practice, what probes the
structure of hadrons is not the beam itself but rather a particle coupling to the partons,
e.g. photons for an electron beam, or gluons or quarks for hadron beams.

It is known that more partons (more structure of the hadron) are seen as
the wavelength of the probe particle gets shorter, as schematically drawn in Fig. 2.7.
In this figure, an electron as a projectile collides with a quark inside a target proton.
Such an interaction is called deep-inelastic scattering (DIS). The scattering proceeds through
the exchange of a virtual photon $\gamma^*$, which mediates the force and probes the quarks inside the
proton with a short wavelength if $Q^2 = -(p_{e'} - p_e)^2$ is large, where $p_e$ and $p_{e'}$ are the
four-momenta of the incoming and scattered electron, since the wavelength
is $\propto \sqrt{1/Q^2}$. When the partons start to feel the external force from the electron
beam, an energetic parton may radiate another parton before coupling to the virtual
photon, since the coupling constant involved in the parton radiation, $\alpha_s$, is larger
than the electromagnetic coupling. This behaviour of the parton radiation is very well
described by a theoretical framework based on perturbative QCD, for example, by the
DGLAP equation (for a review, see, for example, Ref. [3], or in more detail Ref. [4]).
The equation, together with experimental data from lepton-hadron scattering experiments,
tells us that the number of partons increases with $Q^2$, except for very high-$x$ partons, which
give momentum to low-$x$ partons through parton radiation.

As a consequence, the number of partons, or the parton density function (PDF) $f_i$,
where $i$ denotes the type of the parton (gluon or quark flavour), depends not only on
$x$ but also on the wavelength of the probe: $f_i = f_i(x, Q^2)$. In the low-$x$ ($< 0.1$) regime,
the number of partons increases logarithmically with $Q^2$. It also increases rapidly
as $x$ of the parton gets lower ($x < 0.1$), $f_i \propto x^{-\lambda}$, where $\lambda$ is typically 0.2-0.4.
Examples of parton density functions may be found in the structure function section of the
Particle Data Group review [5] and references therein.

For hadron-hadron collisions, the projectile is a hadron and the force responsible
for the scattering is also mediated by a parton. Determining $Q^2$ of such collisions requires
$p_z$ of the scattered parton, which is not well reconstructed. Instead, $p_T$ of the
hard-scattered parton is used as the probing scale when evaluating the parton density. The
cross section of such hard collisions can be expressed as a product of the parton
densities of partons A and B and the scattering cross section for $AB \to CD$, where
C and D denote the two outgoing partons, as shown in Fig. 2.8:
The probing scale is called the factorisation scale $\mu_F$, which is $p_T$ in this case.
Fig. 2.8 A schematic diagram showing how hard scattering cross section is factorised into
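For reference, the factorised cross section described in the preceding paragraph is commonly written in the following form. The notation (a sum over parton types and $\hat{\sigma}$ for the parton-level cross section) is an assumption made for illustration and may differ from the one used elsewhere in this book:

$$\sigma = \sum_{A,B} \int dx_A\, dx_B\; f_A(x_A, \mu_F^2)\, f_B(x_B, \mu_F^2)\; \hat{\sigma}_{AB \to CD}(x_A, x_B, \mu_F^2),$$

where $f_A$ and $f_B$ are the parton densities of the two incoming hadrons evaluated at the factorisation scale $\mu_F$.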
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This formula assumes that a parton from a hadron collides with a parton from the 
other hadron. This picture may not be valid if the number of partons inside a hadron
is very large (in particular for partons at very low $x$) or if the scattering cross
section is very large (e.g. due to a large $\alpha_s$ in the low-$p_T$ regime). In such cases, more than
one parton pair may scatter. If the number of scatterings becomes too large,
the interaction may no longer be described by perturbative QCD. It may have
to be described, at least partially, by a theoretical framework for soft interactions,
which treats the entire hadron as one body in the interaction.
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Apparatus 


In this chapter, we present an overview of the apparatus used in high energy physics. 
First, we describe how the particle collisions with high energy are obtained, and then 
how the reaction of the particle collisions is recorded as the experimental data. 


3.1 Particle Collisions at High Energies 


High energy particle physics (in short, high energy physics) is the field of research that aims to
reveal the ultimate constituents of the universe and the rules obeyed by elementary
particles by experimentally observing particle reactions. Since the size that can be
probed is determined by the de Broglie wavelength, $\lambda = h/p$, the availability of
high energy particles is essential for investigating ever more microscopic scales. At the
same time, the momentum transfer in particle-particle collisions can be used to
generate new particles that are different from the ones before the collision. This
implies that the higher the momentum transfer, or the higher the collision energy,
the heavier the particles that can be generated. For these reasons, we have been using
particles with high energy and will continue to do so in the future. The history of
high energy physics is the history of increasing collision energy.
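To get a feeling for the scales involved, the de Broglie relation can be evaluated for a few momenta. This is a minimal sketch using $hc = 2\pi\hbar c$ with $\hbar c \approx 197.327$ MeV fm; the chosen momenta are illustrative.

```python
import math

HBAR_C_MEV_FM = 197.327  # hbar * c in MeV fm


def de_broglie_wavelength_fm(p_mev: float) -> float:
    """lambda = h / p, with the momentum p given in MeV/c and the result in fm."""
    return 2.0 * math.pi * HBAR_C_MEV_FM / p_mev


for p_mev in (1.0e3, 1.0e5, 1.0e6):  # 1 GeV, 100 GeV, 1 TeV
    print(f"p = {p_mev / 1e3:7.1f} GeV  ->  lambda = "
          f"{de_broglie_wavelength_fm(p_mev):.2e} fm")
```

A momentum of about 1 GeV already corresponds to a wavelength comparable to the size of a proton (about 1 fm), and every factor of ten in momentum improves the resolution by the same factor.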

In the early days of high energy physics, only cathode rays or particles emitted
from radioactive materials were available as sources of particles for research.
The naming convention of $\alpha$, $\beta$, $\gamma$, ..., reflects this history. As time went by,
physicists discovered cosmic rays and started to use them as a source of particles. Cosmic
rays are still widely used in modern high energy physics. Underground neutrino experiments
are typical examples; there are many facilities to detect and study cosmic,
solar, and atmospheric neutrinos. Meanwhile, physicists succeeded in building
accelerators that allowed them to study artificially produced subatomic particles.
Many new particles were discovered and investigated in accelerator experiments.
The main subject of this book is, however, accelerator experiments, or more
specifically collider experiments. Below, we concentrate on the two main topics of
collider experiments: accelerators and detectors.


3.2 Accelerator 


Various types of accelerators have been built around the world. Here, we describe the Large
Hadron Collider (LHC) [1] at CERN as an example of a large accelerator complex.
The LHC complex consists of several different accelerators, shown in Fig. 3.1.

The LINAC is the first accelerator in the chain; it accelerates protons, or actually H$^-$ ions, to
150 MeV. The electrons of the H$^-$ ions are stripped off just before injection into the
Booster, so that the H$^-$ ions become protons. The Booster accelerates the protons to 1.4 GeV
and sends them to the proton synchrotron (PS), where the protons increase their energy
to 25 GeV. The protons are further accelerated by the super proton synchrotron (SPS)
to 450 GeV, and then finally injected into the LHC. The proton energy can be increased
to 7 TeV in the design of the LHC, but the highest energy achieved so far is 6.5 TeV at
the time of writing this book (in 2021). As we have just seen with the LHC, it is very
common for a large collider complex to consist of several accelerators.

The beam energy is one of the most important parameters in collider experiments;
it is related to the potential to generate heavy particles, the interaction
cross section, and so on. It is a long-standing tradition that physicists
look for something new in the particle reactions initiated by the highest energy
collisions. In fact, the history of discoveries in high energy physics is the history
of accelerators with ever-increasing beam energy. Higher energy
machines have enabled us to “see” the subatomic world with higher resolution, and
have generated heavier objects, such as the Higgs boson.

Fig. 3.1 CERN accelerator complex [1]. Reprinted under the Terms of Use from [2] © 2013-2022
CERN. All rights reserved

In addition, the luminosity of the particle collisions is another key parameter
of the experiment. The higher the luminosity, the more data we can accumulate per unit
time. This allows us to improve the precision in terms of statistical uncertainty,
or to search for rarer events such as particle decays with small branching fractions.

Particles inside a synchrotron such as the LHC are accelerated by radio frequency (RF)
waves that are generated in a so-called RF cavity. Therefore, only the particles
that are located at an appropriate phase of the RF wave can be accelerated. The others
are decelerated and cannot stay in the orbit of the accelerator. We call a cluster of
particles defined by the RF in this way a “bunch”. The LHC is operated with a 40 MHz bunch
frequency, and hence a bunch crossing occurs every 25 ns if all the bunches are
filled with protons.

In the explanation above, we have carefully distinguished
two terms: the bunch crossing and the collision. A bunch crossing means that
two clusters of particles cross in a small region of space; a particle collision occurs
when some of the particles in them interact. At the LHC, that is, in proton-proton collisions, if the
number of protons in each bunch is small, or the proton beams are not squeezed
enough, particle collisions would not occur so frequently, although the frequency
of the bunch crossing is 40 MHz. Such a situation is referred to as “low luminosity”.
The situation is different for electron-positron colliders: the cross section in
electron-positron collisions is much
lower than that in proton-proton collisions, and hence a particle collision may
not occur at every bunch crossing even at a high-luminosity electron-positron
collider. This is true for the KEKB experiment, for example. At the LHC, however,
practically every bunch crossing produces proton-proton collisions. Let us discuss
a concrete example. Assuming a proton-proton collision cross section of 80 mb
and an instantaneous luminosity of $2 \times 10^{34}$ cm$^{-2}$ s$^{-1}$, the number of collisions,
or interactions, per second is $16 \times 10^{8}$. Let us also assume that this luminosity is
achieved with 25 ns bunch spacing, which is close to the actual LHC
running conditions in 2018. Under these assumptions, the average number of interactions per
bunch crossing is $16 \times 10^{8} \times 25 \times 10^{-9} = 40$. In reality, the number of protons is
not uniform across the bunches, and there is a statistical fluctuation around
the average value. Nevertheless, one can see that it is very rare to have zero interactions in
a bunch crossing. Therefore, we will use the terms “bunch crossing” and “particle
collision” or “particle interaction” interchangeably from here on, where no
confusion arises.
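The arithmetic of this example can be reproduced directly. The sketch below uses the cross section, luminosity, and bunch spacing quoted above. Treating the number of interactions per crossing as Poisson distributed is an additional assumption, used here to estimate how rare an empty crossing is.

```python
import math

MB_TO_CM2 = 1e-27  # 1 mb = 1e-27 cm^2


def interaction_rate(sigma_mb: float, lumi_cm2_s: float) -> float:
    """Interactions per second: cross section times instantaneous luminosity."""
    return sigma_mb * MB_TO_CM2 * lumi_cm2_s


def mean_interactions_per_crossing(sigma_mb: float, lumi_cm2_s: float,
                                   bunch_spacing_s: float) -> float:
    """Average number of interactions in one bunch crossing."""
    return interaction_rate(sigma_mb, lumi_cm2_s) * bunch_spacing_s


rate = interaction_rate(80.0, 2e34)
mu = mean_interactions_per_crossing(80.0, 2e34, 25e-9)
p_zero = math.exp(-mu)  # Poisson probability of an empty crossing (assumption)

print(f"interaction rate: {rate:.2e} per second")
print(f"mean interactions per crossing: {mu:.1f}")
print(f"probability of zero interactions: {p_zero:.1e}")
```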


So far, we have discussed only colliders. There is another type
of accelerator experiment, the fixed target experiment. As shown in Fig. 3.1, for
example, the Booster, PS, and SPS are used for various fixed target experiments.
Particles accelerated by the accelerators are extracted and directed onto a fixed target,
instead of being collided with each other. With this type of experiment, it is much
more difficult to reach a high centre-of-mass energy than with colliders, but it is much
easier to obtain a high interaction rate, owing to the large size of the target.
For this reason, fixed target experiments are suitable for collecting high statistics,
and are widely used for rare decay searches.

Fixed target experiments have the concept of a “spill”, which does not exist
at colliders. In collider experiments, the particle-particle collisions
last until the luminosity becomes low because of the finite beam lifetime. In fixed target
experiments, on the other hand, all the particles inside the accelerator are extracted within a
certain amount of time, of the order of seconds or minutes. Once all the
particles are extracted, new particles are injected through the injector chain into the
main accelerator. This cycle is repeated in fixed target experiments. Therefore,
the beam is only available for an experiment while the particles are being extracted and hit
the target. This period is called a “spill”.


3.3 Detector 


The particle collisions produced in collider, fixed target, or cosmic-ray experiments
need to be recorded by some means. Many particles are produced in these collisions.
Some people look for new particles, new decay chains, new patterns in the event
kinematics, and so on, that have not yet been discovered. Others try to measure the
rates of specific reactions, such as cross sections or branching ratios of particles.
In any case, we want to detect all the particles produced in the reactions and to
measure their trajectories and energies (and flight times if necessary) as precisely as
possible. In this regard, geometrical acceptance, detection efficiency, and measurement
resolution are the important figures of merit when considering detectors.

Below, let us take a close look at $t\bar{t}$ pair-production events, where a pair of a
top and an anti-top quark is produced, as an example to see what we have to detect
and measure in high energy experiments. As the top quark immediately decays to a
$b$-quark and a $W$ boson with a probability close to 100%, a $t\bar{t}$ pair becomes two pairs
of $b$ and $W$ without leaving any trace of top quarks in detectors. The $W$ boson decays
to $e\nu_e$, $\mu\nu_\mu$, or $\tau\nu_\tau$ with a probability of about 11% each, and to a quark-anti-quark
pair with a probability of about 66%. The former is called a leptonic decay and the
latter a hadronic decay. This results in three types of final states.


• Both $W$ bosons decay leptonically, called the dilepton or two-lepton channel.
• One $W$ boson decays leptonically and the other hadronically, called the lepton+jet or
  one-lepton channel.
• Both $W$ bosons decay hadronically, called the all-hadronic or no (or zero)-lepton channel.


We use the lepton+jet channel as a further example in order to describe the particle 
detection, because the variety of particles in the lepton+jet channel is more than that 
in the other two channels. 
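The relative size of the three channels follows from the $W$ branching fractions quoted above. The sketch below uses the rounded values from the text (about 11% per lepton flavour and about 66% hadronic), so the fractions do not add up exactly to one. The option to count only electrons and muons as leptons is an illustrative assumption.

```python
BR_LEPTON_EACH = 0.11  # W -> l nu, per lepton flavour (rounded, from the text)
BR_HADRONIC = 0.66     # W -> quark anti-quark (rounded, from the text)


def ttbar_channel_fractions(n_lepton_flavours: int = 3) -> dict:
    """Fractions of ttbar events in each channel.

    n_lepton_flavours = 3 counts e, mu and tau as leptons;
    n_lepton_flavours = 2 counts only e and mu.
    """
    br_lep = n_lepton_flavours * BR_LEPTON_EACH
    return {
        "dilepton": br_lep ** 2,
        "lepton+jet": 2 * br_lep * BR_HADRONIC,  # either W may be the leptonic one
        "all-hadronic": BR_HADRONIC ** 2,
    }


for n in (3, 2):
    fractions = ttbar_channel_fractions(n)
    summary = ", ".join(f"{name}: {value:.3f}" for name, value in fractions.items())
    print(f"{n} lepton flavours -> {summary}")
```

The factor of two in the lepton+jet channel is the reason it is as abundant as the all-hadronic channel when all three lepton flavours are counted.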

Here, let us assume that one $W$ decays to $e\nu_e$ or $\mu\nu_\mu$, and the other hadronically. Then
the $t\bar{t}$ final state consists of two $b$-quarks, one electron (muon), one $\nu_e$ ($\nu_\mu$), and
two more light ($u$, $d$, $c$, or $s$) quarks. Experimentalists want to detect all these
particles, and hence try to make the detector as hermetic as possible, i.e. a larger solid angle
coverage around the particle interaction point is preferable. The next
question is how to detect and identify all these particles. In the following, we provide
an overview of the basics of how each type of particle interacts with materials or
detectors, and how they are captured. The details of particle identification will be
discussed in Chap. 6.


3.3.1 Particle Interaction with Material 


3.3.1.1 Electron and Photon 

Electrons in the energy range of interest create electromagnetic showers immediately
after hitting dense materials such as calorimeters. Hence, their energy and position
can be measured in calorimeters by sensing the energy deposited by the electromagnetic
showers. At the same time, most of the detectors used in high energy physics have
a device to measure the trajectories of charged particles, which allows us to measure
their momenta in conjunction with the magnetic field provided by a magnet. This is
called a magnetic spectrometer. In addition, the precise tracking of charged particles
provides information to find the particle collision point in collider experiments,
and helps particle identification, which will be described later in detail.
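The principle of the magnetic spectrometer can be illustrated with the standard relation between transverse momentum, magnetic field, and bending radius, $p_T\,[\mathrm{GeV}] \approx 0.3\,|q|\,B\,[\mathrm{T}]\,R\,[\mathrm{m}]$, which is not written out in the text. The sketch assumes a uniform field perpendicular to the transverse plane; the field strength and track length are illustrative assumptions.

```python
K_GEV_PER_T_M = 0.2998  # GeV / (T m), from the speed of light


def radius_from_pt(pt_gev: float, b_tesla: float, charge: int = 1) -> float:
    """Bending radius in metres of a track with transverse momentum pt_gev."""
    return pt_gev / (K_GEV_PER_T_M * abs(charge) * b_tesla)


def pt_from_radius(radius_m: float, b_tesla: float, charge: int = 1) -> float:
    """Transverse momentum in GeV from the measured bending radius."""
    return K_GEV_PER_T_M * abs(charge) * b_tesla * radius_m


def sagitta(chord_m: float, radius_m: float) -> float:
    """Sagitta of the arc over a chord, valid when the chord is much shorter than the radius."""
    return chord_m ** 2 / (8.0 * radius_m)


b_field = 2.0       # tesla, illustrative
track_length = 1.0  # metres, illustrative lever arm of the tracker
for pt in (1.0, 10.0, 100.0):
    radius = radius_from_pt(pt, b_field)
    print(f"p_T = {pt:6.1f} GeV: R = {radius:8.2f} m, "
          f"sagitta = {sagitta(track_length, radius) * 1e3:.3f} mm")
```

The sagitta shrinks in inverse proportion to the momentum, which is why the momentum resolution of a tracker degrades at high momentum.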

The photon is very similar to the electron in terms of detection in the high
energy regime, because a photon hitting a material also creates an electromagnetic shower.
An electromagnetic calorimeter therefore usually measures the energy of both electrons and
photons. There is an important difference, however: the photon is a neutral particle, and
hence no track is detected in the charged particle tracking system. This difference
is used to distinguish photons from electrons.


3.3.1.2 Muon 

Because muons in the momentum range of interest do not create electromagnetic
showers in materials, they deposit energy almost exclusively through the ionisation of
detector materials. This feature allows us to discriminate muons from other charged
particles by placing enough material in their path, which is usually a part of the detector such
as the calorimeters and/or the solenoid magnet. In most collider experiments, there are two
charged particle trackers, one located near the particle collision point, and the other
behind dense materials such as calorimeters. Particles detected behind the dense
materials can be identified as muons with a high probability. By connecting the trajectories
measured by the two trackers, one can make sure that the muon really comes from the particle
collision point.


3.3.1.3 Quark (or Gluon) 

All quarks produced in particle collisions or in decays of other particles are
immediately hadronised, except for top quarks, because of the nature of QCD (see
Sects. 2.5 and 6.4.1). In $t\bar{t}$ events, there are two $b$-quarks produced by the decay of
the top quarks, and two light quarks from the decay of a $W$. All four quarks turn
into hadrons. The number of hadrons that emerge from a single quark depends mostly
on the energy of the original quark. The higher the energy, the more hadrons emerge.
Because the top quarks and $W$ bosons are much heavier than light quarks, many
($O(10)$) hadrons are formed for each of the four quarks, and they are aligned with the
direction of the momentum vector of the original quark. Such a cluster of particles
is called a jet, which will be explained in Sect. 6.4.

It would be ideal to measure the momenta of all the particles inside a jet. However,
magnetic spectrometers cannot detect neutral particles such as photons and
neutrons. For example, a $\pi^0$ meson immediately decays to two photons with a branching
ratio close to 100%, so the two photons need to be detected. This is why
the energy and direction of jets are traditionally measured by calorimeters. More specifically,
hadrons are measured by the combination of the electromagnetic calorimeter and the
hadronic calorimeter behind it, in contrast to electromagnetic showers from electrons and
photons, which are detected by the electromagnetic calorimeter.


3.3.1.4 Neutrino 

The cross section for neutrinos to interact with materials is too small for them to be
detected. Except for dedicated facilities for neutrino experiments, ordinary collider
experiments cannot detect neutrinos, which results in “missing energy”. In electron-positron
symmetric-energy colliders, for example, the momentum and energy of the initial state are well
defined, i.e. the sum of momenta is zero. Momentum conservation allows us to
deduce the momentum vector of the neutrinos from the missing momentum, assuming
the detector is hermetic enough.

One needs to modify the above idea slightly for hadron colliders. In the picture of
high energy physics, a proton consists of many quarks and gluons, i.e. partons.
What actually collide in proton-proton colliders, for example, are the
partons in the protons, not the protons themselves. This means that even at symmetric-energy
hadron colliders, the collision itself is asymmetric, because the
energy of the colliding partons varies event by event, and there is no principle or law that
forces two colliding partons to have the same energy. We cannot predict, event by event,
which partons actually collide and how much energy they carry, even though we know the
momentum of the protons. Therefore, momentum conservation along the beam direction
cannot be used at hadron colliders; it can be used only in the plane perpendicular to the
colliding beams. Here, we ignore the Fermi motion of the partons inside the protons
because its energy is negligibly small compared to the colliding beam energy. Thus at
hadron colliders, neutrino momenta can be measured only in the plane perpendicular
to the beam. The result is called “missing $p_T$” or “missing $E_T$”, which can be a vector
(x and y components) or a scalar (the magnitude of the vector) depending on the context.
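
The idea of inferring the neutrino momentum from the transverse momentum imbalance can be sketched in a few lines of Python. This is only a minimal illustration, not an actual reconstruction algorithm: the function name `missing_pt` and the event content are hypothetical, and the visible particles are assumed to be given simply as (px, py) pairs in GeV.

```python
import math

def missing_pt(visible):
    """Return (mpx, mpy, magnitude) of the missing transverse momentum.

    `visible` is a list of (px, py) pairs in GeV for all reconstructed
    particles; the missing vector is minus their vector sum.
    """
    sum_px = sum(px for px, _ in visible)
    sum_py = sum(py for _, py in visible)
    mpx, mpy = -sum_px, -sum_py
    return mpx, mpy, math.hypot(mpx, mpy)

# Hypothetical event: three visible objects (values invented for illustration).
event = [(30.0, 10.0), (-5.0, 20.0), (-10.0, -18.0)]
mpx, mpy, met = missing_pt(event)
print(f"missing pT vector = ({mpx:.1f}, {mpy:.1f}) GeV, magnitude = {met:.1f} GeV")
```

The tuple returned here carries both forms mentioned above: the vector (x and y components) and the scalar magnitude.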


3.3.2 ATLAS Detector 


Having seen what kinds of particles and which of their properties need to be detected,
we next discuss how they are detected. The layout of a multi-purpose
detector for high energy physics is common to many experiments, because if one
wants to detect all kinds of particles, the layout is essentially determined by the
nature of the interaction of each particle. The innermost part is a charged
particle tracker with as little material as possible, so that all particles can penetrate
the tracker and reach the calorimeters for energy measurements. Because the radiation length
is much shorter than the interaction length, i.e. an electromagnetic shower develops over a
much shorter distance than a hadronic shower, the electromagnetic calorimeter is placed in
front of the hadronic calorimeter. A muon is identified by the fact that it rarely produces
either an electromagnetic or a hadronic shower in our energy region, and hence penetrates
massive materials such as the calorimeters. Therefore, the muon detector
is located in the outermost part of the detector system. To summarise, the
order of detector elements from inside to outside tends to be a charged particle tracker,
an electromagnetic calorimeter, a hadron calorimeter (for energy frontier experiments),
and a muon detector.

Since the concept is common to most detectors, we use the ATLAS detector [3]
in the following as an example to introduce an actual detector. Figure 3.2
shows the ATLAS detector, which consists of a barrel and two endcap parts. Each barrel
and endcap is actually a collection of various detector components, which will
be described later. A beam pipe runs through the middle of the detector to
let the proton beams pass. In addition, there are Solenoid and Toroid
magnets that provide a magnetic field, which allows the momentum of charged
particles to be measured. The Solenoid is located between the charged particle tracker and
the electromagnetic calorimeter, and the Toroids outside the hadron calorimeter, covering
high-$|\eta|$ regions. The field strength of the Solenoid is 2 T. The integrated field
strength of the Toroids varies from 2 to 9 T·m depending on $|\eta|$ and $\phi$.


Fig. 3.2 Overview of the ATLAS detector [3]. Labelled components: pixel detector,
semiconductor tracker, transition radiation tracker, solenoid magnet, LAr electromagnetic
calorimeters, tile calorimeters, LAr hadronic end-cap and forward calorimeters, toroid
magnets, and muon chambers. Reprinted under the Terms of Use from [4] ATLAS
Experiment © 2008 CERN. All rights reserved

Surrounding the proton-proton interaction point is the charged particle tracker
consisting of pixel-type and strip-type silicon detectors, referred to as the pixel detector
and the semiconductor tracker (SCT), respectively. All charged particles, such as
electrons, charged pions, and muons, interact with the tracker material and
lose energy, which creates electron-hole pairs inside the silicon
sensor. The holes and/or electrons are collected by the electric field inside the
sensor at the electrodes, amplified, and recorded as the signal of a particle hit. The
pixel and strip detectors have many layers of sensors, enabling us to “reconstruct”
the particle trajectory by connecting the space points of the hits in many layers.

Outside the silicon detectors is another tracking device for charged particles, consisting
of many transition radiation tubes and referred to as the transition radiation tracker
(TRT). The mechanism to detect charged particles is similar to that of the silicon detectors.
Each tube of the TRT is filled with gas, which acts as the sensor instead of silicon.
Charged particles passing through the gas create ion-electron pairs that are read
out as a signal through the electrodes, either cathode or anode wires.
Silicon detectors usually have fine pixels or a fine strip pitch to achieve
good spatial resolution, typically of the order of 10 or 100 $\mu$m. A
gas-based tracking device such as the TRT, on the other hand, uses time information on top
of the discrete hit information collected by the wires. Knowing the drift time of the ions
and/or electrons in the gas, one can deduce more precisely where the particle
interacted with the gas by recording the arrival time of the signal. Although the typical size
of a tube is of the order of mm, a position resolution of O(100 $\mu$m) can be achieved.

Most charged and neutral particles penetrate the tracking detectors and
enter the electromagnetic calorimeter, which has a sandwich structure of
lead and liquid argon (LAr). Electrons and photons develop electromagnetic
showers mainly in the lead, which is called the “absorber”. The electrons created in the
shower deposit their energy in the LAr, which is called the “detector”, inducing the
electric signals that are recorded. The total thickness in units of the radiation length
(see Sect. 6.2.1) is more than 24 $X_0$ (depending on $|\eta|$), which is large enough to
contain the electromagnetic showers, leading to precise measurements of the energy. In
addition to the energy measurement, the segmentation of the calorimeter allows us to
identify the location where electrons or photons hit the calorimeter.

The hadron calorimeter is located outside the electromagnetic calorimeter. The
detector type varies depending on the location, but the common
concept, also used for the ATLAS hadron calorimeter, is a sandwich
structure made of an absorber and an active region (detector). The barrel region
uses iron as the absorber and plastic scintillators as the sensor to detect the energy
deposited by particles created in the hadron showers. The scintillation light is detected
by photomultipliers through wavelength-shifting fibres. The total interaction
length¹ is roughly 10 $\lambda_0$. Only muons and neutrinos in the SM can penetrate the
hadron calorimeters, except for punch-through hadrons. If particles are not
stopped by the calorimeters, that is, if their showers leak out behind the calorimeters, they
can be detected by other detectors (in practice, the muon detectors). Such particles are
called “punch-through” particles (punch-through hadrons).

1 The (nuclear) interaction length $\lambda_0$ (also written $\lambda_a$, $\lambda_{\mathrm{int}}$, etc.) is a useful parameter for hadron showers;
it represents the mean free path for inelastic collisions. The basic idea is similar to $X_0$. In
practice, $\lambda_0$ can be approximated by $35 A^{1/3}$ g/cm$^2$, where A is the mass number; it is quoted
in units of g/cm$^2$ or cm. $\lambda_0$ for iron is about 17 cm. Typical hadron calorimeters have about 10 $\lambda_0$.

Following the hadron calorimeter, the outermost layer of the ATLAS detector is
the muon spectrometer, consisting of monitored drift tubes (MDT) and cathode strip
chambers (CSC) for precise tracking, and resistive plate chambers (RPC) and thin gap
chambers (TGC) for providing fast signals for triggering. These are all gas detectors
like the TRT, which allow us to measure the passage of particles. The position resolution of
the MDT and CSC is of the order of 100 $\mu$m. On top of providing fast signals to form a
trigger, the RPC and TGC determine the event timing, which means that these detectors
identify the bunch crossing in which the interaction occurred. Thus, a high timing resolution
is required for these detectors.


3.3.3 Trigger 


The total inelastic cross section of proton-proton collisions is about 100 mb at
$\sqrt{s} = 14$ TeV at the LHC. When the instantaneous luminosity of the LHC
reaches $2 \times 10^{34}$ cm$^{-2}$s$^{-1}$, the rate of inelastic proton-proton interactions is
expected to be about 2 GHz. Since the bunch crossing frequency of the LHC is
designed to be 40 MHz, we expect 50 proton-proton collisions in every bunch crossing,
which are called pile-up events, as discussed in Sect. 3.2. As the instantaneous luminosity
goes up at a fixed bunch crossing rate, the number of pile-up events
increases further. On the other hand, the event rates of physics processes of interest, such
as the production of the Higgs boson, are expected to be of the order of 1-10 Hz or much
less, depending on the process, as shown in Table 3.1. The inelastic cross
section is thus so large that even when an event is produced by an interesting physics
process, it is overlaid with many pile-up events.
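
The numbers in this paragraph follow from multiplying a cross section by the instantaneous luminosity. The short Python sketch below reproduces this arithmetic with the values quoted in the text and in Table 3.1; the helper name `event_rate` and the unit-conversion constants are introduced here only for illustration.

```python
# Unit conversions to cm^2 (1 b = 1e-24 cm^2).
MB, NB, PB = 1e-27, 1e-33, 1e-36

def event_rate(cross_section_cm2, luminosity_cm2_s):
    """Event rate in Hz: cross section times instantaneous luminosity."""
    return cross_section_cm2 * luminosity_cm2_s

luminosity = 2e34           # cm^-2 s^-1, value quoted in the text
bunch_crossing_rate = 40e6  # Hz, design value quoted in the text

inelastic_rate = event_rate(100 * MB, luminosity)
pileup = inelastic_rate / bunch_crossing_rate
print(f"inelastic rate = {inelastic_rate:.1e} Hz, pile-up = {pileup:.0f} per crossing")

# Cross sections taken from Table 3.1.
cross_sections = {
    "W boson production": 190 * NB,
    "Z boson production": 60 * NB,
    "Top quark pair production": 850 * PB,
    "Higgs boson production": 60 * PB,
}
rates_hz = {name: event_rate(xs, luminosity) for name, xs in cross_sections.items()}
for name, rate in rates_hz.items():
    print(f"{name}: {rate:.3g} Hz")
```

Comparing `inelastic_rate` with the entries of `rates_hz` makes the gap of many orders of magnitude between pile-up and interesting processes explicit.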

The total number of channels of the ATLAS detector is about $2 \times 10^{8}$. If each
channel sent one binary digit for every bunch crossing, the detector would send
40 MHz $\times$ $2 \times 10^{8}$ $\sim 10^{16}$ bits $\sim 10^{15}$ bytes (1 petabyte) of data
every second. Although the data size per event can be reduced by a factor of about 100 by
suppressing noise-like data and zero data, it is
still impractical to record the data of all proton collisions in the data storage system.
Before the data of an event are written to the data storage, the event is analysed online
and a decision is made whether or not to keep it for later offline study. This
process is called “trigger”. The current ATLAS trigger and data acquisition (DAQ)
system is based on two levels of online event selection, called the level 1 trigger (L1
trigger) and the high level trigger (HLT), as shown in Fig. 3.3 [5].
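
The data volume estimate above is simple arithmetic, which the following sketch spells out. It uses only the numbers quoted in the text; the variable names are illustrative, and the one-bit-per-channel assumption is the same simplification made in the text.

```python
n_channels = 2e8             # detector channels, value quoted in the text
bunch_crossing_rate = 40e6   # Hz
bits_per_channel = 1         # simplification used in the text: one binary digit per crossing

bits_per_second = n_channels * bits_per_channel * bunch_crossing_rate
bytes_per_second = bits_per_second / 8

suppression_factor = 100     # approximate reduction quoted in the text
reduced_bytes_per_second = bytes_per_second / suppression_factor

print(f"raw:        {bits_per_second:.1e} bits/s = {bytes_per_second:.1e} bytes/s")
print(f"suppressed: {reduced_bytes_per_second:.1e} bytes/s")
```

Even after the factor of 100, the estimate stays at the level of 10 terabytes per second, which is why an online selection is needed.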


Table 3.1 The rough cross section and the event rate for typical processes. The centre-of-mass
energy and the instantaneous luminosity of the LHC are assumed to be 13 TeV and
$2 \times 10^{34}$ cm$^{-2}$s$^{-1}$, respectively

Process                      Cross section   Event rate
W boson production           190 nb          3.8 kHz
Z boson production           60 nb           1.2 kHz
Top quark pair production    850 pb          17 Hz
Higgs boson production       60 pb           1.2 Hz

Fig. 3.3 ATLAS trigger and DAQ scheme. Reprinted under the Creative Commons Attribution 4.0
International License from [5] © CERN for the benefit of the ATLAS collaboration 2017


3.3.3.1 Level 1 Trigger

The L1 trigger makes an initial selection using a large number of electronic modules
(printed circuit boards equipped with application-specific integrated circuits
(ASICs) and field-programmable gate arrays (FPGAs), forming multi-chip modules)
and their interconnections, taking reduced-granularity information
from a subset of detectors as input. There are two main L1 trigger systems in ATLAS. The
L1 calorimeter trigger (Level-1 Calo in Fig. 3.3) is based on reduced-granularity
information from the electromagnetic and hadronic calorimeters, and searches for
high-$p_T$ electrons and photons, jets, and taus decaying into hadrons, as well as large
missing and total transverse energy. The L1 muon trigger (Level-1 Muon) is based
on information from the so-called trigger chambers, i.e. resistive plate chambers (RPC) in
the barrel and thin gap chambers (TGC) in the endcaps, and selects high-$p_T$ muons.
The numbers of objects such as muons, electrons and photons, jets, and taus above
a set of thresholds on $p_T$ or $E_T$ (for example, the muon momentum thresholds
are set to 6, 10, 15, 20 GeV, and so on) in the fiducial region are counted and sent to the
L1 central trigger processor (Central Trigger). The L1 trigger provides “region-of-interest
(RoI)” information, including the position ($\eta$ and $\phi$) and $p_T$ range of candidate
objects, as input to the HLT. In the case of the triggers based on the missing and
total transverse energy, the information on whether an event passes the
threshold is sent to the L1 central trigger processor. The central
trigger processor makes the L1 trigger decision based on combinations of objects
required in coincidence or veto, and provides the “L1 accept” signal (Level-1
Accept). The L1 trigger makes a decision within about 2.5 $\mu$s and reduces
the event rate from 40 MHz to 100 kHz. While the trigger decision is being made,
the information from all detector channels has to be retained in “pipeline” memories,
which are usually placed on the front-end electronics of the detectors (FE in
Fig. 3.3). The depth of the pipeline memories depends on the size of data per event,
the bunch crossing frequency, and the trigger latency.²
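
The object counting and the pipeline depth described above can be illustrated with a toy example. The sketch below is not how the L1 hardware is implemented (the real decision is made in custom electronics on reduced-granularity data); the function name `count_above_thresholds` and the muon $p_T$ values are hypothetical, while the thresholds, the latency, and the rates are those quoted in the text.

```python
def count_above_thresholds(pts, thresholds):
    """Return {threshold: number of objects with pT above it} (pT in GeV)."""
    return {thr: sum(1 for pt in pts if pt > thr) for thr in thresholds}

# Hypothetical muon candidates in one bunch crossing (values invented).
muon_pts = [4.2, 7.5, 12.0, 22.3]
thresholds = [6, 10, 15, 20]  # GeV, example thresholds quoted in the text
counts = count_above_thresholds(muon_pts, thresholds)
print(counts)

# Number of bunch crossings that must be buffered while L1 decides.
latency_s = 2.5e-6
bunch_crossing_rate = 40e6   # Hz
pipeline_depth = round(latency_s * bunch_crossing_rate)

# Rate reduction achieved by L1.
l1_output_rate = 100e3       # Hz
rejection = bunch_crossing_rate / l1_output_rate
print(f"pipeline depth = {pipeline_depth} crossings, L1 reduction = {rejection:.0f}x")
```

In this picture the multiplicities in `counts` play the role of the information sent to the central trigger processor, and the pipeline has to hold about 100 bunch crossings per channel.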


3.3.3.2 High Level Trigger 

Only events selected by the L1 trigger are read out from the front-end electronics
to the readout systems (ROS). Further trigger selections are made by the
HLT, which makes a more precise selection using a large number of processors.
Using the RoI information, the HLT selectively accesses data from the readout systems.
Typically, only data from a small fraction of the detector, corresponding to the RoI
information provided by the L1 trigger, are needed by the HLT. Hence, usually
only a few per cent of the full event data are required for the event processing. The
HLT makes use of information on muons, electrons, photons, jets, taus decaying
into hadrons, missing and total transverse energy, and the charged particle tracks
provided by the inner tracking system. More specifically, combinations of the $p_T$ or $E_T$
of the above objects and event topologies, such as invariant masses and angles
between the objects, are used for the HLT decision. Only events accepted by the
HLT are recorded in the data storage. The HLT reduces the event rate from 100 kHz
to a few kHz.


3.3.3.3 Trigger Requirements for Selecting Physics Events 

The trigger should reduce the data volume while keeping candidate events for further physics
analyses. The target physics can be SM processes, including the production of Higgs,
W and Z bosons, and searches for signatures beyond the SM such as supersymmetry
or other theoretical models. The trigger needs to cover all signatures of these target
physics processes using electrons, photons, muons, jets, taus, b-jets, and missing
transverse energy. A few thousand different trigger conditions are prepared, and
the list of these triggers is called a “trigger menu”. The trigger menu is frequently
updated depending on the accelerator conditions and the physics of interest. In practice,
before starting your physics analysis, you need to design the trigger condition so that it
stores the events of interest adequately while keeping the trigger rate of background
events low enough.

2 Thanks to the progress in high-speed optical transmitters, hit information from all detector
channels can be transmitted to the electronics modules in a counting room.

For example, candidates for Higgs production followed by the decay
$H \to ZZ^{*} \to \mu\mu\mu\mu$ can be collected by a combination of the L1 muon trigger with a
$p_T$ > 15 GeV threshold and the HLT with a $p_T$ > 20 GeV threshold. In this case, the
trigger efficiency is high for muons whose reconstructed $p_T$ lies well above the rising
edge of the “turn-on curve”, i.e. $p_T$ > 15 GeV for L1 and $p_T$ > 20 GeV for the HLT, while
the efficiency is low if the muons are well below these thresholds (Fig. 3.4). If at least one
of the four muons from the Higgs decay passes through the fiducial detector volume and has
$p_T$ > 20 GeV, the event can be kept for later physics analysis. Background
events from inelastic proton-proton interactions with many low-$p_T$ particles,
mostly hadrons, are effectively rejected by the muon trigger with its high $p_T$
threshold. However, some background events are not removed by the trigger: those
in which a charged hadron is misidentified as a muon, a low-$p_T$ muon is mismeasured
as a high-$p_T$ muon, or a few low-$p_T$ tracks are combinatorially reconstructed as one
high-$p_T$ muon. As discussed at the beginning of this section, since the cross section of
inelastic proton-proton interactions is in most cases very large compared to that of the
interesting physics processes, the trigger rate can be dominated by background
events even though the misidentification and mismeasurement of muons are rare.
The trigger rate needs to be monitored as a function of the instantaneous luminosity,
as shown in Fig. 3.5, and controlled by optimising, for example, the $p_T$ threshold of
the objects concerned.
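
To make the notion of a turn-on curve more concrete, the sketch below models the trigger efficiency as a smeared step function. This parametrisation (a Gaussian cumulative distribution) is an assumption chosen for illustration and is not taken from the ATLAS measurement; the function name `turn_on_efficiency`, the resolution of 2 GeV, and the plateau value are all hypothetical.

```python
import math

def turn_on_efficiency(pt, threshold, resolution, plateau=1.0):
    """Toy turn-on curve: a step at `threshold` smeared by a Gaussian of width `resolution`.

    All quantities are in GeV except `plateau`, the efficiency far above threshold.
    """
    z = (pt - threshold) / (math.sqrt(2.0) * resolution)
    return plateau * 0.5 * (1.0 + math.erf(z))

# Hypothetical HLT threshold of 20 GeV with an assumed 2 GeV online resolution.
for pt in (10, 18, 20, 22, 30):
    eff = turn_on_efficiency(pt, threshold=20.0, resolution=2.0)
    print(f"pT = {pt:2d} GeV -> efficiency = {eff:.3f}")
```

In such a model the efficiency is 50% of the plateau exactly at the threshold and saturates only a few resolution widths above it, which is why analyses usually require the offline $p_T$ to be safely above the trigger threshold.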


Fig. 3.5 The rate of the muon trigger. Reprinted under the Creative Commons
Attribution 4.0 International License from [5] © CERN for the benefit of the ATLAS
collaboration 2017


3.3.4 Optimisation of Detector Performance


If the rate of events that need to be recorded is low, the detector can
be optimised for resolution and/or efficiency. In order to achieve a high position
resolution, for example, one might decrease the size of each pixel in the pixel detector
used for tracking. However, this increases the number of channels to be read out and
might limit the DAQ speed, which would then have to be improved. Therefore,
optimisation and compromise are necessary when designing a detector, and their
balance depends on many constraints, for example physics requirements, detector
technologies, and budgets.

Readers should be aware that not only detectors but also experiments
themselves are strongly constrained by such boundary conditions in reality. It is
instructive to think about the constraints imposed on the
detector under study, and why a particular design was chosen. Such training
will help you to design and build your own detectors and experiments.


Statistics


Statistics is a tool to show how accurate a measurement is. Particle physicists make 
use of statistics to know, for example, with how much probability the events under 
study are signal-like in a new particle search, how much improvement a new mea- 
surement gives compared with the past measurements, how powerful a new analysis 
method is, and how much a certain theoretical prediction is restricted. Statistics is 
also useful to discriminate the signal events from the background events and to esti- 
mate the number of the signal events and background events properly. Therefore, the 
basics of statistics are essential for experimentalists. 

As an example of how statistics is used in particle physics, let us explain how
the cross section of a certain physics process ($\sigma_{\mathrm{phys}}$) can be measured. As described
in Chap. 2, $\sigma_{\mathrm{phys}}$ can be extracted from five experimental observables: the number
of observed events ($N_{\mathrm{obs}}$), the number of estimated background events ($N_{\mathrm{bkgd}}$), the
acceptance of the event selection ($A$), the detection efficiency ($\varepsilon$), and the integrated
luminosity ($L_{\mathrm{int}}$) using

$$ \sigma_{\mathrm{phys}} = \frac{N_{\mathrm{obs}} - N_{\mathrm{bkgd}}}{L_{\mathrm{int}} \, A \, \varepsilon}. \tag{4.1} $$

$N_{\mathrm{obs}}$ is measured by counting the number of events after the signal event selection.
$N_{\mathrm{bkgd}}$ is estimated from data and/or Monte Carlo simulation samples (MC samples).
When data are used, it is often estimated from a fit to the distribution of a physics
observable such as the invariant mass of particles. The geometrical acceptance and the
detection efficiency of the signal events are often determined using a large number of
MC samples. The separation of signal and background can be improved by selections
based on the likelihood method or multivariate statistical analyses. In the above
example, there are five observables; four of them ($N_{\mathrm{bkgd}}$, $L_{\mathrm{int}}$, $A$, and $\varepsilon$) have
both statistical and systematic uncertainties, while $N_{\mathrm{obs}}$ has only a statistical
uncertainty. A systematic uncertainty is an uncertainty that arises from the methods used
in the detector calibrations, the data analyses, etc. In the end, the total uncertainty
of the cross-section measurement is determined by propagating and combining the
statistical and systematic uncertainties of the five observables.
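
Equation (4.1) translates directly into code. The sketch below evaluates it for invented input values; the function name `cross_section` and all numbers are hypothetical. The statistical uncertainty shown takes only the Poisson fluctuation of the observed event count into account (a square-root-of-N approximation, anticipating the discussion of the Poisson distribution later in this chapter) and ignores all other uncertainties.

```python
import math

def cross_section(n_obs, n_bkgd, lumi_int, acceptance, efficiency):
    """Eq. (4.1): (N_obs - N_bkgd) / (L_int * A * epsilon)."""
    return (n_obs - n_bkgd) / (lumi_int * acceptance * efficiency)

# Invented inputs for illustration only.
n_obs = 1200          # observed events
n_bkgd = 200.0        # estimated background events
lumi_int = 10_000.0   # integrated luminosity in pb^-1
acceptance = 0.5
efficiency = 0.8

sigma = cross_section(n_obs, n_bkgd, lumi_int, acceptance, efficiency)

# Statistical uncertainty from N_obs only, assuming Poisson fluctuations.
sigma_stat = math.sqrt(n_obs) / (lumi_int * acceptance * efficiency)
print(f"sigma = {sigma:.4f} +- {sigma_stat:.4f} (stat.) pb")
```

Because the luminosity is given in pb$^{-1}$, the result is in pb; a full analysis would add the systematic uncertainties of the other four inputs.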

All of the above requires a good knowledge of statistics. This chapter
describes the basics of statistics, which include uncertainties,
probability distributions, the propagation of uncertainties, basic techniques for fitting a
distribution, and the basics of the maximum likelihood method.


4.1 Uncertainty 


One can never know the true values of nature, but we believe that such true
values exist. This idea comes from a frequentist viewpoint and is adopted in many
analyses in collider physics. All we can obtain is an estimator of the true value based
on the outcome of the experiment, i.e. the measurement. Since perfect measurements
can never be performed, the result of a measurement is always represented by a central
value and its uncertainty. The central value is often given by the most probable
value or the expectation value of the measurement. The spread of the probability
distribution of the estimator of the central value is often used as the uncertainty,
which usually consists of two kinds: the statistical uncertainty and the systematic
uncertainty. Therefore, the measurement is usually expressed as

(measurement) = (central value) ± (stat. uncertainty) ± (syst. uncertainty).    (4.2)

In this section, the basic concepts of the uncertainties are explained using the example
of the cross-section measurement shown in Eq. (4.1).


4.1.1 Statistical Uncertainty 


The statistical uncertainty arises from stochastic fluctuations of random processes.
If an observed event is uncorrelated with events observed in the past and the future,
the statistical uncertainty follows an appropriate probability distribution. $N_{\mathrm{obs}}$ obeys the
Poisson distribution (or the normal distribution if $N_{\mathrm{obs}}$ is large enough), which is
explained in the next section. The acceptance ($A$) and efficiency ($\varepsilon$) follow the binomial
distribution, which is also explained in the next section. The mean and the
uncertainty can be obtained from these probability distributions.


4.1.2 Systematic Uncertainty 


Suppose charged leptons are selected in measuring the cross section of a certain
physics process. We must know the selection efficiency of the charged leptons to
extract the cross section (Eq. (4.1)). The event selection efficiency, which includes
the efficiencies of all selection stages, is usually estimated from MC samples.
Ideally, the efficiency obtained from MC samples is the same as the one in real
experimental data (in short, real data). But who knows if this is true? In real life,
it is impossible to have perfect correspondence between MC samples and real
data.

The differences must be evaluated, and corrected if needed. Let us keep using the
physics process selected by a requirement on charged leptons as an example, where
the selection efficiency is estimated purely from MC samples (defined as $\varepsilon^{\mathrm{MC}}$). We
must know the selection efficiency of the leptons in real data, not in the MC samples.
We correct the efficiency from the MC simulation if $\varepsilon^{\mathrm{MC}}$ is not consistent with the one
in real data. This correction can be obtained using well-understood data and/or a
well-known physics process (so-called control data), which are collected by an event
selection without the lepton requirements. The caveat here is that we cannot use the
selection that we want to evaluate when collecting the control data. For example,
$Z \to \ell\ell$, where $\ell$ represents a charged lepton, can be selected by requiring the invariant
mass of two charged tracks to be around 90 GeV. Assuming for brevity that the purity of
these control data is high enough, we can extract the efficiency of selecting the lepton
for both the MC simulation sample ($\varepsilon^{\mathrm{MC}}_{\mathrm{cont}}$) and real data ($\varepsilon^{\mathrm{data}}_{\mathrm{cont}}$).

Once we obtain $\varepsilon^{\mathrm{data}}_{\mathrm{cont}}$ and $\varepsilon^{\mathrm{MC}}_{\mathrm{cont}}$, a so-called scale factor (SF) is defined as
SF = $\varepsilon^{\mathrm{data}}_{\mathrm{cont}} / \varepsilon^{\mathrm{MC}}_{\mathrm{cont}}$. Using the SF, the efficiency used in Eq. (4.1) is corrected
as $\varepsilon = \varepsilon^{\mathrm{MC}} \times \mathrm{SF}$.
The uncertainty of the SF, which usually comes from the limited statistics of the control data,
is taken into account as a systematic uncertainty of the cross-section measurement.
In addition, if $\varepsilon^{\mathrm{MC}}$ is different from $\varepsilon^{\mathrm{MC}}_{\mathrm{cont}}$ (an indication that the lepton selection
efficiency depends on the physics process), the difference must also be taken into account
as another systematic uncertainty.
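
The scale-factor correction described above amounts to two one-line formulas. The following sketch applies them to invented efficiencies; the function names and all numerical values are hypothetical and serve only to show the order of the operations.

```python
def scale_factor(eff_data_cont, eff_mc_cont):
    """SF = efficiency in control data / efficiency in control MC."""
    return eff_data_cont / eff_mc_cont

def corrected_efficiency(eff_mc, sf):
    """Efficiency to be used in Eq. (4.1): MC efficiency times SF."""
    return eff_mc * sf

# Invented efficiencies for illustration.
eff_data_cont = 0.90   # lepton efficiency measured in control data (e.g. Z -> ll)
eff_mc_cont = 0.95     # same quantity in the control MC sample
eff_mc = 0.80          # efficiency of the signal process from MC

sf = scale_factor(eff_data_cont, eff_mc_cont)
eff = corrected_efficiency(eff_mc, sf)
print(f"SF = {sf:.4f}, corrected efficiency = {eff:.4f}")
```

In a real analysis the SF is usually derived in bins of lepton kinematics, and its uncertainty is propagated to the cross section as a systematic uncertainty, as described in the text.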

The above is one typical example. Usually, we need to consider several
different types of systematic uncertainties. The sources of systematic uncertainty
are, for example, poor understanding of the reconstruction of jets, electrons, muons, and
charged tracks, mismodelling of the fitting function used to discriminate the background
events from the signal events, imperfections of the theoretical model in the Monte
Carlo simulation, and mismeasurement of the luminosity. The study of systematic
uncertainties is essential not only to estimate the proper uncertainty of the measurement
but also to help understand the details of the detector response and the dependence of
what we are measuring on other physics parameters. The study of systematic
uncertainties sometimes also reveals weaknesses of the analysis procedure.


4.2 Probability Distribution 


Let us assume you roll a die that is a perfect cube. The number n, which is an integer
from 1 to 6, comes up randomly, and one can only predict the probability that the
number on the die is n, which is denoted by P(n). In this case, the probability
of n is the flat distribution P(n) = 1/6.

Now suppose you throw two cubic dice. The sum of the numbers on
the two dice is a random integer from 2 to 12. The probability P(n) that the sum
of the numbers on the two dice is n is obtained as a function of n, as shown in Table 4.1.
We cannot say which n will occur in the next throw, but we know how large the probability
is for each n.

Table 4.1 The probability that the sum of the numbers on two dice is n

n      2      3      4      5     6      7     8      9     10     11     12
P(n)   1/36   1/18   1/12   1/9   5/36   1/6   5/36   1/9   1/12   1/18   1/36
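
The entries of Table 4.1 can be reproduced by enumerating all 36 equally likely outcomes of two dice. The sketch below does this with exact fractions; the function name `sum_distribution` is introduced here only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sum_distribution(n_dice=2, faces=6):
    """Exact probability P(n) that the sum of `n_dice` fair dice equals n."""
    outcomes = product(range(1, faces + 1), repeat=n_dice)
    counts = Counter(sum(roll) for roll in outcomes)
    total = faces ** n_dice
    return {n: Fraction(c, total) for n, c in sorted(counts.items())}

dist = sum_distribution()
for n, p in dist.items():
    print(f"P({n}) = {p}")
print("sum of probabilities =", sum(dist.values()))
```

Using `Fraction` instead of floating-point numbers keeps the probabilities in the same reduced form as in the table and makes the normalisation to 1 exact.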

Similarly, in collider physics, it is impossible to predict what kind of event (physics
process) will happen in the next collision. One can only know the probability
of having a certain event in the next collision. The number of observed events
(n) in a fixed number of collisions follows a certain probability distribution (P(n)).

Not only the number of events but also many other observable quantities, such
as the energy or angle of a particle created in a collision, the invariant mass reconstructed
from particles, and so on, follow certain probability distributions. Typical
probability distributions that are often used in particle physics experiments are
introduced in the following subsections.


4.2.1 Basics of Probability Distributions 


In the example of the dice described previously, the random number n is discrete
and P(n) is the probability of obtaining n. For a discrete probability distribution such
as that of the dice, the probability of obtaining a value from $n_j$ to $n_\ell$ is given by
$\sum_{i=j}^{\ell} P(n_i)$, if all possible random numbers are ordered as
$n_0, n_1, n_2, \ldots, n_j, \ldots, n_\ell, \ldots, n_k$. If
x is continuous, P(x) is a probability density function of x, but it is often simply called a
probability. The probability of finding x in the interval from x to x + dx is given by P(x)dx.

The probability distribution is normalised to 1, so that the probability is defined
to be between 0 and 1. If $n_i$ is discrete,

$$ \sum_{i=0}^{k} P(n_i) = 1 \tag{4.3} $$

or, if x is continuous,

$$ \int P(x)\,dx = 1 \tag{4.4} $$

where the integral is over all eligible x.

The expectation value ($\mu$) and the standard deviation ($\sigma$) are often used as the
measured value and its statistical uncertainty, respectively, in particle physics. So the
result of a measurement is usually represented as $\mu \pm \sigma$.

The expectation value corresponds to the arithmetic average weighted by the probability.
It is defined to be

$$ \mu = \sum_{i=0}^{k} n_i P(n_i) \tag{4.5} $$

if $n_i$ is discrete, or

$$ \mu = E[x] = \int x P(x)\,dx \tag{4.6} $$

if x is continuous (the integral is over all eligible x).

The standard deviation indicates how much a measurement varies
statistically from the average. The square of the standard deviation is called the
variance and is defined as

$$ \sigma^2 = \sum_{i=0}^{k} (n_i - \mu)^2 P(n_i) \tag{4.7} $$

if $n_i$ is discrete, or

$$ \sigma^2 = E[(x - E[x])^2] = \int (x - \mu)^2 P(x)\,dx = E[x^2] - (E[x])^2 \tag{4.8} $$

if x is continuous (the integral is over all eligible x).
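
As a concrete check of the discrete definitions in Eqs. (4.5) and (4.7), the sketch below computes the expectation value and variance of a single fair die, and verifies the identity $\sigma^2 = E[n^2] - (E[n])^2$ for that case. The helper names `expectation` and `variance` are introduced here for illustration.

```python
from fractions import Fraction

def expectation(dist):
    """Eq. (4.5): sum of n * P(n) over all values n."""
    return sum(n * p for n, p in dist.items())

def variance(dist):
    """Eq. (4.7): sum of (n - mu)^2 * P(n) over all values n."""
    mu = expectation(dist)
    return sum((n - mu) ** 2 * p for n, p in dist.items())

# Single fair die: P(n) = 1/6 for n = 1..6.
die = {n: Fraction(1, 6) for n in range(1, 7)}

mu = expectation(die)
var = variance(die)
second_moment = sum(n ** 2 * p for n, p in die.items())

print("mu =", mu, " variance =", var)
print("E[n^2] - mu^2 =", second_moment - mu ** 2)
```

The same two functions work for any discrete distribution given as a dictionary of values and probabilities, for example the two-dice distribution of Table 4.1.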


4.2.2 Binomial Distribution 


When a particle passes through a certain readout channel of the detector, it usually
provides a “hit” signal, but occasionally a “miss”. The reason for a “miss” may
be, for example, a geometrically dead region, dead time of the readout, or
an unexpectedly small charge produced by the interaction between the particle
and the detector material. The probability of “hit” or “miss” is given by the binomial
distribution. Let us assume that N particles pass through the detector. The probability
of n “hits” and $(N - n)$ “misses” in N trials is given by

$$ P(n) = \frac{N!}{n!\,(N-n)!}\, p^{n} (1-p)^{N-n} \tag{4.9} $$

where p and $(1 - p)$ are the probabilities of “hit” and “miss” in a single trial, respectively.
The sum of P(n) from n = 0 to n = N is the binomial
expansion of $[(1-p) + p]^N = 1$. This means that Eq. (4.9) is normalised
to 1. Figure 4.1 shows the distributions for N = 20 with p = 0.1, 0.5, and
0.8.

The expectation value (μ) and variance (σ²) can be obtained by substituting
Eq. (4.9) into Eqs. (4.5) and (4.7):

    \mu = Np,    (4.10)
    \sigma^2 = Np(1-p).    (4.11)

The detailed calculation can be found in Appendix A.1. When N is larger than a
few tens and p is small, such as p < 0.1, the binomial distribution is approximated by
the Poisson distribution with μ ≈ Np. In contrast, when N is large and p is moderate,
such as p ≈ 0.5, the binomial distribution is approximated by the normal distribution
with the mean and variance expressed by Eqs. (4.10) and (4.11). The Poisson and
the normal distributions are explained in the following subsections.

Fig. 4.1 The binomial distributions represented by Eq. (4.9) for N = 20 with p = 0.1, 0.5, and 0.8
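As a numerical cross-check of Eqs. (4.9) to (4.11), the sketch below evaluates the binomial distribution for N = 20 and p = 0.1 (one of the cases of Fig. 4.1) and compares a binomial distribution with a Poisson distribution of μ = Np. The comparison values N = 100 and p = 0.02 are assumptions chosen only to illustrate the "large N, small p" regime.

```python
import math


def binomial_pmf(n, N, p):
    """Eq. (4.9): probability of n hits in N trials with hit probability p."""
    return math.comb(N, n) * p ** n * (1.0 - p) ** (N - n)


def poisson_pmf(n, mu):
    """Poisson probability of n events for an expectation value mu."""
    return mu ** n * math.exp(-mu) / math.factorial(n)


# Case shown in Fig. 4.1: N = 20, p = 0.1.
N, p = 20, 0.1
pmf = [binomial_pmf(n, N, p) for n in range(N + 1)]
total = sum(pmf)
mean = sum(n * P for n, P in enumerate(pmf))
var = sum((n - mean) ** 2 * P for n, P in enumerate(pmf))
print("sum =", total, " mean =", mean, " variance =", var)

# Assumed illustration of the Poisson limit: N = 100, p = 0.02, mu = Np = 2.
max_diff = max(
    abs(binomial_pmf(n, 100, 0.02) - poisson_pmf(n, 2.0)) for n in range(101)
)
print("largest |binomial - Poisson| =", max_diff)
```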

4.2.3 Poisson Distribution 

When a theory expects μ events of a certain physics process for some integrated
luminosity at a collider experiment, the probability of observing n events obeys the
Poisson distribution, expressed as

    P(n) = \frac{\mu^n e^{-\mu}}{n!}.    (4.12)

Simply put, this distribution can be used in the case where rare events occur. In
general, the Poisson distribution describes the probability of n events occurring in a
unit interval of time if the events occur with a known average rate μ and independently
of the time since the last event. Figure 4.2 shows the Poisson distributions for μ = 1,
5, 10, and 20. Using the Maclaurin expansion, i.e. e^{\mu} = \sum_{n=0}^{\infty} \mu^n / n!,
it is shown that the sum of Eq. (4.12) over n is normalised to 1.

By substituting Eq. (4.12) into Eqs. (4.5) and (4.7), both the expectation value and
variance of the Poisson distribution are expressed with only one parameter μ (see
Appendix A.2):

    E[n] = \mu,    (4.13)
    \sigma^2 = \mu.    (4.14)

Then the standard deviation is expressed as σ = √μ. It means that you can use the
square root of the number of events as a statistical uncertainty in counting
experiments. In fact, the mean and the square of the standard deviation of the distributions
in Fig. 4.2 are close to μ in Eq. (4.12). But the mean is not exactly the same as the
peak position due to the asymmetric shape of the Poisson distribution. If μ becomes
large, for example larger than around 10, the distribution is relatively symmetric and
is approximated by the normal distribution.

Fig. 4.2 The Poisson distributions represented by Eq. (4.12) for μ = 1, 5, 10, and 20
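The statement that both the mean and the variance of the Poisson distribution equal μ can be verified by summing Eq. (4.12) directly. This is a minimal sketch; the cut-off `n_max` and the counting example of 10000 events are assumptions made for illustration.

```python
import math


def poisson_pmf(n, mu):
    """Eq. (4.12), evaluated in log space to avoid overflow for large n."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))


def poisson_moments(mu, n_max=200):
    """Return (sum, mean, variance) of Eq. (4.12) summed up to n_max."""
    probs = [poisson_pmf(n, mu) for n in range(n_max + 1)]
    total = sum(probs)
    mean = sum(n * p for n, p in enumerate(probs))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    return total, mean, var


# The same expectation values as in Fig. 4.2.
for mu in (1.0, 5.0, 10.0, 20.0):
    total, mean, var = poisson_moments(mu)
    print(f"mu={mu:5.1f}  sum={total:.6f}  mean={mean:.6f}  variance={var:.6f}")

# Hypothetical counting experiment: the statistical uncertainty is sqrt(n).
n_obs = 10000
print(n_obs, "+/-", math.sqrt(n_obs))
```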
4.2.4 Normal Distribution (Gaussian Distribution)

The normal distribution, which is very often called the Gaussian distribution, commonly
appears in nature and is used not only in particle physics but almost everywhere
in science. The Gaussian function is symmetric and continuous. It is expressed by
two parameters μ and σ,

    P(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right).    (4.15)

The coefficient 1/(\sqrt{2\pi}\,\sigma) is a normalisation factor which ensures that the
integral of Eq. (4.15) from −∞ to ∞ is normalised to 1.¹ The Gaussian distributions for
μ = 100 and σ = 10, 20, and 30 are shown in Fig. 4.3.

1 It can be derived using \int_{-\infty}^{\infty} e^{-a^2 t^2}\,dt = \sqrt{\pi}/a.

By substituting Eq. (4.15) into Eqs. (4.6) and (4.8), we find that the parameter
μ is the expectation value and σ² is the variance:

    \int_{-\infty}^{\infty} x \cdot \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) dx = \mu,    (4.16)

    \int_{-\infty}^{\infty} (x-\mu)^2 \cdot \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) dx = \sigma^2.    (4.17)

For experimental measurements, the values of μ and σ are taken to be the measured
value and its uncertainty.
The integral of the Gaussian distribution in the range μ ± σ is

    \int_{\mu-\sigma}^{\mu+\sigma} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) dx = \mathrm{erf}\left( \frac{1}{\sqrt{2}} \right) = 0.6827    (4.18)

where erf(x) is the error function defined by

    \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt    (4.19)

and is shown in Fig. 4.4. Equation (4.18) shows that in the measurement of x, the
probability to have |x − μ| < σ is about 68%. In other words, the probability of
|x − μ| > σ is 1 − 0.6827 = 0.3173 (32%). Several examples of the probability of
having |x − μ| > δ are shown in Table 4.2.

Particle physicists use expressions such as “a certain measurement has an
excess of 5σ from the background-only hypothesis”. This means that the number of
observed events exceeds the number of events expected from background alone
at the 5σ level; that is, such a measurement occurs with a probability of only
5.73 × 10^-7 (for both sides) in a background-only environment.
This is so rare that we call it an “observation” and/or a “discovery”. In particle physics,
we claim “evidence” and “discovery” of something new for 3σ and 5σ excesses,
respectively. The full width at half maximum (FWHM) is also often used as the
uncertainty rather than σ. This can be easily translated to σ with the equation
FWHM = 2\sqrt{2 \ln 2}\,σ ≈ 2.355σ.
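The tail probabilities quoted above, and listed in Table 4.2, follow from the complementary error function, since the probability of |x − μ| > δσ is 1 − erf(δ/√2). The sketch below recomputes them with Python's `math.erfc`; the function names are chosen here for illustration.

```python
import math


def two_sided_tail(delta):
    """Probability of |x - mu| > delta * sigma for a Gaussian."""
    return math.erfc(delta / math.sqrt(2.0))


def one_sided_tail(delta):
    """Probability of x - mu > delta * sigma for a Gaussian."""
    return 0.5 * math.erfc(delta / math.sqrt(2.0))


# FWHM expressed in units of sigma: 2 * sqrt(2 ln 2).
fwhm_over_sigma = 2.0 * math.sqrt(2.0 * math.log(2.0))

for delta in (1.0, 1.64, 1.96, 2.0, fwhm_over_sigma, 3.0, 4.0, 5.0):
    print(f"{delta:6.3f} sigma  both-side={two_sided_tail(delta):.3e}"
          f"  one-side={one_sided_tail(delta):.3e}")
```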


4.2.5 Uniform Distribution 


The uniform distribution, which represents a fixed probability in a certain interval
of x, is defined as

    P(x) = \begin{cases} \dfrac{1}{b-a} & (a \le x \le b) \\ 0 & (\text{otherwise}). \end{cases}    (4.20)

Fig. 4.4 The error function

Table 4.2 The probability outside a certain range expressed in units of σ

    δ         Probability (both-side)    Probability (one-side)
    1σ        0.3173                     0.1587
    1.64σ     0.101                      0.0505
    1.96σ     0.0500                     0.0250
    2σ        0.0455                     0.0228
    2.355σ    0.0185                     0.00926
    3σ        2.70 × 10^-3               1.35 × 10^-3
    4σ        6.33 × 10^-5               3.17 × 10^-5
    5σ        5.73 × 10^-7               2.87 × 10^-7


We can calculate the expectation value and variance of the uniform distribution by
substituting Eq. (4.20) into Eqs. (4.6) and (4.8):

    \mu = \int_a^b x P(x)\,dx = \int_a^b \frac{x}{b-a}\,dx = \frac{1}{2}(a+b),    (4.21)

    \sigma^2 = \int_a^b (x-\mu)^2 P(x)\,dx = \int_a^b \left( x - \frac{a+b}{2} \right)^2 \frac{1}{b-a}\,dx = \frac{1}{12}(b-a)^2.    (4.22)


An important application of the uniform distribution is position measurements. The
position where the particle passes through is determined by position-sensitive
sensors. Let’s consider a detector with strip-shaped sensors aligned perpendicular to
the x axis, which allows you to know the particle position along x. If a certain sensor
with a width d, which has a sensitive area from x = a to x = b (d = b − a),
provides a hit signal, the expectation value and uncertainty of the position where the
particle passes through are estimated to be μ = (a + b)/2 and σ = (b − a)/√12 = d/√12,
respectively.
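The d/√12 resolution of a single strip can be checked with a small Monte Carlo simulation. The following is a sketch under stated assumptions: the strip pitch of 80 μm, the number of simulated particles, and the random seed are hypothetical values, and each particle is assumed to cross the strip at a uniformly distributed position.

```python
import math
import random


def strip_resolution(pitch):
    """Position uncertainty d / sqrt(12) of a strip of width d."""
    return pitch / math.sqrt(12.0)


def simulate_residual_rms(pitch, n_particles=100_000, seed=1):
    """RMS of (true position - strip centre) for uniformly distributed hits."""
    rng = random.Random(seed)
    centre = 0.5 * pitch  # the reconstructed position is the strip centre
    sum_sq = 0.0
    for _ in range(n_particles):
        x_true = rng.uniform(0.0, pitch)
        sum_sq += (x_true - centre) ** 2
    return math.sqrt(sum_sq / n_particles)


pitch_um = 80.0  # hypothetical strip pitch in micrometres
print("analytic :", strip_resolution(pitch_um))
print("simulated:", simulate_residual_rms(pitch_um))
```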
4.2.6 Breit-Wigner Distribution


The Breit-Wigner distribution is used to express the probability density for the energy
of an unstable particle with a mass M and a decay width Γ (and a mean lifetime of
τ = 1/Γ). The Breit-Wigner distribution is defined as

    P(E) = \frac{1}{2\pi} \frac{\Gamma}{(E-M)^2 + \Gamma^2/4}    (4.23)

and shown in Fig. 4.5. The expectation value and variance of the Breit-Wigner distribution
are not well-defined, since the integrals of Eqs. (4.6) and (4.8) are divergent. Instead,
the peak position M and the FWHM Γ are used to characterise the distribution.

Fig. 4.5 Breit-Wigner distribution for Z boson (M = 91.2 GeV, and Γ = 2.5 GeV)


4.2.7 Exponential Distribution 


The exponential distribution is used to express the probability density for the
survival time of an unstable particle with a mean lifetime of τ. The exponential
distribution for a continuous variable 0 ≤ x < ∞ is defined as

    P(x) = \frac{1}{\tau} e^{-x/\tau},    (4.24)

using one parameter τ. The expectation value and variance of x are derived as

    \mu = \int_0^{\infty} x P(x)\,dx = \frac{1}{\tau} \int_0^{\infty} x e^{-x/\tau}\,dx = \tau,    (4.25)

    \sigma^2 = \int_0^{\infty} (x-\mu)^2 P(x)\,dx = \frac{1}{\tau} \int_0^{\infty} (x-\tau)^2 e^{-x/\tau}\,dx = \tau^2.    (4.26)


4.2.8 χ² (Chi-Square) Distribution


In case n observables x_i independently obey the normal distributions

    N_i(\mu_i, \sigma_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left( -\frac{(x_i-\mu_i)^2}{2\sigma_i^2} \right),

the χ² value defined as

    \chi^2 = \sum_{i=1}^{n} \frac{(x_i-\mu_i)^2}{\sigma_i^2}    (4.27)

is used as the test of a hypothesis, which indicates how well the expectation matches
the experimental data. If the hypothesis describes nature properly, each (x_i − μ_i)²
is expected to be the variance of the measurement, i.e. σ_i². Thus, χ²/n is expected
to be 1.

If χ²/n shows a significant deviation from unity, either the hypothesis or the
estimation of the σ’s is wrong. The probability density function of this χ² distribution
with n degrees of freedom (dof) can be written as

    P(z; n) = \frac{z^{n/2-1} e^{-z/2}}{2^{n/2}\,\Gamma(n/2)}    (z \ge 0),    (4.28)

where Γ is the gamma function. Figure 4.6 shows the χ² distribution f(χ², n) for dof n =
1 to 5. For large n, the probability density function of this χ² distribution approaches
the normal distribution with a mean and variance of μ = n and σ² = 2n, respectively.
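A short numerical illustration of Eqs. (4.27) and (4.28) is sketched below. The measured values, predictions, and uncertainties are hypothetical numbers, and the mean and variance of the χ² density are obtained with a simple midpoint-rule integration rather than analytically.

```python
import math


def chi2_value(xs, mus, sigmas):
    """Eq. (4.27): sum of squared deviations in units of the uncertainties."""
    return sum(((x - m) / s) ** 2 for x, m, s in zip(xs, mus, sigmas))


def chi2_pdf(z, n):
    """Eq. (4.28): chi-square probability density with n degrees of freedom."""
    return z ** (n / 2 - 1) * math.exp(-z / 2) / (2 ** (n / 2) * math.gamma(n / 2))


def pdf_moments(n, z_max=100.0, steps=100_000):
    """Mean and variance of chi2_pdf from a midpoint-rule integration."""
    h = z_max / steps
    zs = [(i + 0.5) * h for i in range(steps)]
    mean = sum(z * chi2_pdf(z, n) for z in zs) * h
    var = sum((z - mean) ** 2 * chi2_pdf(z, n) for z in zs) * h
    return mean, var


# Hypothetical measurements compared with a prediction.
measured = [10.2, 19.5, 30.9, 39.6]
predicted = [10.0, 20.0, 30.0, 40.0]
uncertainties = [0.5, 0.5, 0.6, 0.5]

chi2 = chi2_value(measured, predicted, uncertainties)
print("chi2 =", chi2, " chi2/n =", chi2 / len(measured))
print("mean, variance for n = 5:", pdf_moments(5))
```

For n = 5 the integration returns values close to 5 and 10, in line with the mean n and variance 2n mentioned above.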


4.3 Error Propagation 


As the cross section is determined by the values of the five parameters in Eq. (4.1),
a physical quantity is often derived from several parameters which are determined
by measurements. Naturally, the uncertainties of the parameters carry over into the
physical quantity. Let’s assume a physical quantity u depends on n parameters
x_i. Namely, u can be written as a function of x_i (i = 1, 2, ..., n):
u = f(x_1, x_2, ..., x_n) = f(x). The expectation values and uncertainties of x are
known as μ = (μ_1, μ_2, ..., μ_n) and σ = (σ_1, σ_2, ..., σ_n), respectively. A
first-order expansion of the function f(x) around the expectation value μ can be written
as

    f(\boldsymbol{x}) \approx f(\boldsymbol{\mu}) + \sum_{i=1}^{n} \left. \frac{\partial f(\boldsymbol{x})}{\partial x_i} \right|_{\boldsymbol{x}=\boldsymbol{\mu}} (x_i - \mu_i).    (4.29)

Fig. 4.6 χ² distributions. See the main text for details


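Because E[x_i − μ_i] = 0, the expectation value of u is represented at the first order
to be

    E[f(\boldsymbol{x})] \approx f(\boldsymbol{\mu}),    (4.30)

and the expectation value of u² is

    E[f^2(\boldsymbol{x})] \approx f^2(\boldsymbol{\mu})
        + E\left[ \sum_{i=1}^{n} \sum_{j=1}^{n} \left. \frac{\partial f(\boldsymbol{x})}{\partial x_i} \right|_{\boldsymbol{x}=\boldsymbol{\mu}} \left. \frac{\partial f(\boldsymbol{x})}{\partial x_j} \right|_{\boldsymbol{x}=\boldsymbol{\mu}} (x_i - \mu_i)(x_j - \mu_j) \right]
      = f^2(\boldsymbol{\mu})
        + \sum_{i=1}^{n} \left( \left. \frac{\partial f(\boldsymbol{x})}{\partial x_i} \right|_{\boldsymbol{x}=\boldsymbol{\mu}} \right)^2 E[(x_i - \mu_i)^2]
        + \sum_{i \ne j} \left. \frac{\partial f(\boldsymbol{x})}{\partial x_i} \right|_{\boldsymbol{x}=\boldsymbol{\mu}} \left. \frac{\partial f(\boldsymbol{x})}{\partial x_j} \right|_{\boldsymbol{x}=\boldsymbol{\mu}} E[(x_i - \mu_i)(x_j - \mu_j)].    (4.31)

In case x_i and x_j are not correlated, the third term of Eq. (4.31) is 0. Because
E[(x_i − μ_i)²] = σ_i², the variance of u can be calculated to be

    \sigma_u^2 = E[f^2(\boldsymbol{x})] - (E[f(\boldsymbol{x})])^2 \approx \sum_{i=1}^{n} \left( \left. \frac{\partial f(\boldsymbol{x})}{\partial x_i} \right|_{\boldsymbol{x}=\boldsymbol{\mu}} \right)^2 \sigma_i^2.    (4.32)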


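Suppose that in the cross-section measurement, N_obs, N_bkgd, L, A, and ε can be
measured independently. In this case, (the square of) the uncertainty of the cross
section, σ²_{σ_phys}, can be expressed as

    \sigma^2_{\sigma_{phys}} =
        \left( \frac{\partial \sigma_{phys}}{\partial N_{obs}} \right)^2 \sigma^2_{N_{obs}}
      + \left( \frac{\partial \sigma_{phys}}{\partial N_{bkgd}} \right)^2 \sigma^2_{N_{bkgd}}
      + \left( \frac{\partial \sigma_{phys}}{\partial L} \right)^2 \sigma^2_{L}
      + \left( \frac{\partial \sigma_{phys}}{\partial A} \right)^2 \sigma^2_{A}
      + \left( \frac{\partial \sigma_{phys}}{\partial \varepsilon} \right)^2 \sigma^2_{\varepsilon}.    (4.33)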

Imagine you measure the particle detection efficiency of a detector. When
N particles pass through the detector, the detector gives a “hit” signal n_1 times and
a “miss” signal n_2 times (N = n_1 + n_2). In this case, the detection efficiency is given
by ε = n_1/(n_1 + n_2). If n_1 and n_2 are large enough to consider that they are not
correlated and their uncertainties are estimated as √n_1 and √n_2, respectively,
you can show the uncertainty of the efficiency, σ_ε, to be √(ε(1 − ε)/N). A similar
discussion can be done for the asymmetry A = (n_1 − n_2)/(n_1 + n_2), instead of ε.
Show by yourself the uncertainty of the asymmetry, σ_A.
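The first-order propagation of Eq. (4.32) can also be carried out numerically, which is a convenient way to check a hand calculation. The sketch below uses central finite differences for the partial derivatives and applies them to the efficiency example; the counts n_1 = 950 and n_2 = 50 and the helper names are hypothetical.

```python
import math


def propagate(f, values, sigmas, rel_step=1e-6):
    """First-order uncertainty of f, Eq. (4.32), for uncorrelated inputs.

    The partial derivatives are estimated with central finite differences.
    """
    variance = 0.0
    for i, (x, s) in enumerate(zip(values, sigmas)):
        h = rel_step * max(abs(x), 1.0)
        up = list(values)
        down = list(values)
        up[i] = x + h
        down[i] = x - h
        dfdx = (f(*up) - f(*down)) / (2.0 * h)
        variance += (dfdx * s) ** 2
    return math.sqrt(variance)


def efficiency(n1, n2):
    """Detection efficiency n1 / (n1 + n2)."""
    return n1 / (n1 + n2)


n1, n2 = 950.0, 50.0  # hypothetical numbers of hits and misses
sigma_eff = propagate(efficiency, [n1, n2], [math.sqrt(n1), math.sqrt(n2)])

eps = efficiency(n1, n2)
expected = math.sqrt(eps * (1.0 - eps) / (n1 + n2))
print("numerical propagation:", sigma_eff)
print("sqrt(eps(1-eps)/N)   :", expected)
```

The same `propagate` function can be applied to the asymmetry to check your own result for σ_A.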


4.4 Maximum Likelihood Method 


Although we can never know the true values of physical quantities, we can
estimate them from a set of measurements. Consider that we made n independent
measurements and obtained n measured quantities x = (x_1, x_2, ..., x_n). Suppose
that the measured quantities x_i are distributed according to a probability density
function f(x_i; θ) (i = 1, 2, ..., n), where θ = (θ_1, θ_2, ..., θ_m) are unknown
physical quantities. The likelihood function L(x; θ), which is regarded as the
probability to have the set of measurements (x_1, x_2, ..., x_n), is defined as

    L(\boldsymbol{x}; \boldsymbol{\theta}) = f(x_1; \boldsymbol{\theta})\, f(x_2; \boldsymbol{\theta}) \cdots f(x_n; \boldsymbol{\theta}) = \prod_{i=1}^{n} f(x_i; \boldsymbol{\theta}).    (4.34)


If the hypothesis used to construct the probability density function f(x; θ) and the
parameter values θ are correct, one expects L to be at its maximum. To estimate the most
probable values of θ, the maximum likelihood estimators for θ are defined as the
values which maximise the likelihood function. As long as the likelihood function is
a differentiable function of the parameters θ, and the maximum is not at the physical
boundary, the estimators are given by solving the simultaneous equations

    \frac{\partial L}{\partial \theta_i} = 0,  or  \frac{\partial \ln L}{\partial \theta_i} = 0,    i = 1, 2, ..., m.    (4.35)

Because of the characteristics of the logarithm, maximum log-likelihood estimators,
which are equivalent to the maximum likelihood estimators, are often used. To distinguish
the true values of physical quantities (θ = (θ_1, θ_2, ..., θ_m)) from their estimators, the
parameters satisfying Eq. (4.35) are written as θ̂ = (θ̂_1, θ̂_2, ..., θ̂_m).
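As an example, let’s consider that a variable x, which obeys a Gaussian distribution
with unknown μ and σ², is measured n times. The log-likelihood function is

    \ln L(\mu, \sigma^2) = \sum_{i=1}^{n} \ln f(x_i; \mu, \sigma^2)
        = \sum_{i=1}^{n} \ln \left[ \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_i-\mu)^2}{2\sigma^2} \right) \right]
        = \sum_{i=1}^{n} \left( -\ln\sqrt{2\pi} - \frac{1}{2}\ln\sigma^2 - \frac{(x_i-\mu)^2}{2\sigma^2} \right).    (4.36)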


By solving ∂ ln L/∂μ = 0, μ̂ is obtained as

    \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i.    (4.37)

μ̂ is an unbiased estimator for μ, since its expectation value is

    E[\hat{\mu}] = \mu.    (4.38)

This calculation can be found in Appendix A.3. Similarly, solving ∂ ln L/∂σ² = 0 gives

    \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2.    (4.39)


Because μ is an unknown parameter, μ̂ is actually used to estimate σ². Computing
the expectation value of σ̂² gives

    E[\hat{\sigma}^2] = \frac{n-1}{n} \sigma^2,    (4.40)

which means that the estimator σ̂² is biased, because using μ̂ instead of μ reduces
the number of dof by 1. Instead of σ̂²,

    s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \hat{\mu})^2 = \frac{n}{n-1} \hat{\sigma}^2    (4.41)

may be used as a more correct (unbiased) estimator, but the difference between them
can be ignored when n is large enough.
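The estimators of Eqs. (4.37), (4.39), and (4.41) translate directly into code. This is a minimal sketch; the list of measurements is a hypothetical data set.

```python
def gaussian_mle(xs):
    """Return (mu_hat, sigma2_hat, s2) for a list of measurements.

    mu_hat     : Eq. (4.37), the sample mean
    sigma2_hat : Eq. (4.39), the (biased) maximum likelihood variance
    s2         : Eq. (4.41), the variance corrected by n / (n - 1)
    """
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    s2 = n / (n - 1) * sigma2_hat
    return mu_hat, sigma2_hat, s2


# Hypothetical repeated measurements of the same quantity.
measurements = [9.8, 10.1, 10.4, 9.9, 10.3]
mu_hat, sigma2_hat, s2 = gaussian_mle(measurements)
print("mu_hat =", mu_hat)
print("sigma2_hat =", sigma2_hat, " s2 =", s2)
```

With only five measurements the two variance estimates differ by 25%, whereas the difference shrinks as 1/n for larger samples.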
4.5 Least Squares Method


Suppose y = f(x; θ_1, θ_2, ..., θ_m) is a function of x and you want to determine the m
parameters θ = (θ_1, θ_2, ..., θ_m) by measuring y_i at the n points x_i
(i = 1, 2, ..., n). When the uncertainties of the measurements y_i are given by σ_i, the
parameters can be estimated by finding the values of θ that minimise
the following quantity:

    \chi^2 = \sum_{i=1}^{n} \frac{(y_i - f(x_i; \boldsymbol{\theta}))^2}{\sigma_i^2}.    (4.42)

This method is called the least squares method. This method is equivalent to the
maximum likelihood method described in Sect. 4.4 when the measurements obey Gaussian
distributions. In fact, when f(x_i; μ, σ²) obeys a Gaussian distribution, χ² is identical
to −2 ln L(μ, σ²) up to an additive constant that does not depend on the parameters.
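For a function that is linear in its parameters, the minimum of Eq. (4.42) can be found in closed form. The sketch below assumes a straight line f(x; θ_1, θ_2) = θ_1 + θ_2 x, which is an example chosen here and not taken from the text, and uses hypothetical data points.

```python
def fit_line(xs, ys, sigmas):
    """Minimise Eq. (4.42) for f(x) = intercept + slope * x (closed form)."""
    ws = [1.0 / s ** 2 for s in sigmas]  # weights 1 / sigma_i^2
    S = sum(ws)
    Sx = sum(w * x for w, x in zip(ws, xs))
    Sy = sum(w * y for w, y in zip(ws, ys))
    Sxx = sum(w * x * x for w, x in zip(ws, xs))
    Sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    det = S * Sxx - Sx ** 2
    intercept = (Sxx * Sy - Sx * Sxy) / det
    slope = (S * Sxy - Sx * Sy) / det
    return intercept, slope


def chi2(xs, ys, sigmas, intercept, slope):
    """Eq. (4.42) for the straight-line model."""
    return sum(
        ((y - (intercept + slope * x)) / s) ** 2
        for x, y, s in zip(xs, ys, sigmas)
    )


# Hypothetical data points with equal uncertainties.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
sigmas = [0.2, 0.2, 0.2, 0.2]

intercept, slope = fit_line(xs, ys, sigmas)
chi2_min = chi2(xs, ys, sigmas, intercept, slope)
print("intercept =", intercept, " slope =", slope)
print("chi2 at minimum =", chi2_min, " for", len(xs) - 2, "dof")
```

For models that are not linear in θ, the minimum is usually found with a numerical minimiser instead of a closed-form solution.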


4.6 Statistical Figure of Merit 


When we discuss the statistical significance of observed events, the following figure
of merit is often used,

    \frac{N_{signal}}{\sqrt{N_{obs}}} = \frac{N_{signal}}{\sqrt{N_{signal} + N_{background}}}.    (4.43)

Because σ = √N_obs is the statistical uncertainty of the total number of observed
events, the figure of merit above indicates how significant the signal is over
the background in units of σ. For example, suppose 10000 events are
observed and 9500 events are expected as background events after a certain event
selection; the figure of merit is then (10000 − 9500)/√10000 = 5σ. If N_background
is much larger than N_signal, one can use

    \frac{N_{signal}}{\sqrt{N_{background}}}.    (4.44)

The higher the statistical figure of merit, the more sensitive the measurement is
expected to be. Note that in this discussion, only the statistical uncertainty is taken into
account. If you need to consider systematic uncertainties, the figure of merit becomes
more complicated.
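The two figures of merit can be compared for the numerical example given in this section (10000 observed events with 9500 expected background events). The function names below are chosen for illustration only.

```python
import math


def significance(n_signal, n_background):
    """Eq. (4.43): N_signal / sqrt(N_signal + N_background)."""
    return n_signal / math.sqrt(n_signal + n_background)


def significance_large_background(n_signal, n_background):
    """Eq. (4.44): N_signal / sqrt(N_background), for N_background >> N_signal."""
    return n_signal / math.sqrt(n_background)


n_obs = 10000
n_background = 9500
n_signal = n_obs - n_background

print("Eq. (4.43):", significance(n_signal, n_background))
print("Eq. (4.44):", significance_large_background(n_signal, n_background))
```

The two values differ by a few percent here, because the signal is about 5% of the background.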


4.7 Hypothesis Test 


A hypothesis test is a method to describe how well the data agree or disagree with
a given hypothesis. The hypothesis under consideration is called the null hypothesis
H_0. This hypothesis H_0 is compared with a so-called alternative or test hypothesis
H_1 in order to quantify the compatibility of the data with H_0. In practice, H_1 is a
hypothesis we would like to see, for example, the presence of a new particle. In other
words, H_0 is a hypothesis we would like to reject.

Fig. 4.7 Probability density distributions for H_0 (left) and H_1 (right) hypotheses.


4.7.1 Discovery and Exclusion 


According to the Neyman-Pearson lemma, the likelihood ratio L(H_0)/L(H_1) is the
optimal discriminator for the hypothesis test of H_0 versus H_1, so we often use
the likelihood ratio as a test statistic t. However, in this section we use the number
of selected events as a test statistic, assuming that it follows Gaussian distributions
(Eq. (4.15)), in order to understand the p-value, etc. intuitively, and we discuss
discovery and exclusion in terms of this number. Suppose that one Gaussian
distribution N_B(x) is for background-only (the SM) and the other one N_{S+B}(x) is
for signal+background (the signal is new physics beyond the SM). Figure 4.7
shows these two Gaussian distributions. Since the x-axis is the number of events,
the N_{S+B}(x) distribution lies to the right of N_B(x). Here, we assume that
the background-only model is the null hypothesis H_0 and the signal+background model is
H_1. For a hypothesis test, we determine a threshold x^thres to define a significance
level α. In this case, α is defined as

    \alpha = \int_{x^{thres}}^{\infty} N_{H_0:B}(x)\,dx,

which is shown in Fig. 4.7. Then, if H_0 is false and H_1 is true, the probability to
reject H_0 correctly is called the power 1 − β, where β is defined as

    \beta = \int_{-\infty}^{x^{thres}} N_{H_1:S+B}(x)\,dx,

which is also shown in Fig. 4.7. These α and β also correspond to the type I error (false
positive) and the type II error (false negative), respectively. The former is the probability
to reject H_0 wrongly and the latter is the probability to reject H_1 wrongly. Then, we
assume that we obtain the number of events x^obs from the data. We define a p-value p,
which is a probability quantifying the compatibility with H_0:

    p = \int_{x^{obs}}^{\infty} N_{H_0:B}(x)\,dx.

When the value of p is smaller than α, we can say that H_0 is rejected at the
significance level α.

Fig. 4.8 a p_0 for discovery and b 1 − p for exclusion. They are evaluated for the number of
observed events (“obs”). In case of MC studies, the number of observed events is replaced with the
median of signal+background and background-only for (a) and (b), respectively

Figure 4.8a shows an example of the discovery, where H_0 is the background-only
and H_1 is the signal+background. We use the p-value p_0 under H_0 to claim
the discovery of a new particle.² Conventionally, if p_0 is smaller than 2.87 × 10^-7,
what we observe is very rare under H_0 such that we consider that H_0 is rejected. The
p-value can be transformed to a z-value Z, which is defined using a standard Gaussian
distribution as

    p = \int_{Z}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx.

For p = 2.87 × 10^-7, the z-value corresponds to 5σ, as shown in Table 4.2. When
we investigate new physics models using MC samples (MC studies), the observed
x^obs is replaced with the median of the N_{H_1:S+B}(x) distribution. To claim the so-called
“evidence” instead of “discovery”, we conventionally use p = 1.35 × 10^-3 (3σ).
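The conversion between a p-value and a z-value can be sketched with the Python standard library. The background mean of 100 events, its width of 10 events, and the observed count of 150 are hypothetical numbers chosen so that the excess is exactly 5σ.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def p_from_z(z):
    """One-sided tail probability of a standard Gaussian above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def z_from_p(p):
    """z-value Z such that the one-sided tail probability above Z equals p."""
    return -STANDARD_NORMAL.inv_cdf(p)


def p_value_counting(x_obs, bkg_mean, bkg_sigma):
    """p-value under a Gaussian background-only model N_B(x)."""
    return p_from_z((x_obs - bkg_mean) / bkg_sigma)


# Hypothetical example: 150 events observed, 100 +/- 10 expected from background.
p0 = p_value_counting(150.0, 100.0, 10.0)
print("p0 =", p0, " Z =", z_from_p(p0))
print("Z for p = 1.35e-3:", z_from_p(1.35e-3))
```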


2 The suffix of 0 is often used for the background-only. 
Figure 4.8b shows an example of the exclusion of a model, where H_0 is the
signal+background and H_1 is the background-only. When the value of (1 − p) under
H_0 is smaller than 0.05, H_0 is rejected. We call it “95% exclusion.”³ In case of MC
studies, x^obs is replaced with the median of the N_{H_1:B}(x) distribution. The (1 − p)
is denoted as CL_{s+b} in high-energy experiments since the (1 − p) value corresponds
to the compatibility with the signal+background hypothesis. Furthermore, in the LHC
experiments, a CL_s-based exclusion is often used instead of CL_{s+b}. CL_s is defined
as

    CL_s = \frac{CL_{s+b}}{CL_b} = \int_{-\infty}^{x^{obs}} N_{H_0:S+B}(x)\,dx \Big/ \int_{-\infty}^{x^{obs}} N_{H_1:B}(x)\,dx.    (4.45)

In case of MC studies, the denominator is 0.5 because x^obs is the median of the
N_{H_1:B}(x) distribution; CL_s is then 2CL_{s+b}, so that the 95% exclusion using CL_s
corresponds to a (1 − p) of 0.025 for CL_{s+b}. CL_s is not a probability, but in order
to avoid incorrect exclusions, which could be possible when the expected signal is
small, the LHC experiments often use it to claim exclusions.
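Equation (4.45) can be evaluated directly for two Gaussian hypotheses. The sketch below uses hypothetical means and widths for the background-only and signal+background distributions, and takes the median of the background-only distribution as x^obs, as done in MC studies.

```python
from statistics import NormalDist


def cls(x_obs, background, signal_plus_background):
    """Return (CL_s+b, CL_b, CL_s) for Gaussian hypotheses, following Eq. (4.45)."""
    cl_sb = signal_plus_background.cdf(x_obs)  # integral from -inf to x_obs
    cl_b = background.cdf(x_obs)
    return cl_sb, cl_b, cl_sb / cl_b


# Hypothetical Gaussian models for the number of selected events.
background = NormalDist(mu=100.0, sigma=10.0)
signal_plus_background = NormalDist(mu=125.0, sigma=11.0)

# MC study: use the median of the background-only distribution as x_obs.
x_obs = background.mean
cl_sb, cl_b, cl_s = cls(x_obs, background, signal_plus_background)

print("CL_s+b =", cl_sb, " CL_b =", cl_b, " CL_s =", cl_s)
print("excluded at 95% with CL_s:", cl_s < 0.05)
```

Because CL_b is 0.5 in this configuration, CL_s is twice CL_{s+b}, as stated above.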


4.7.2 Profile Likelihood Fit 


Suppose we count the number of observed events n after applying our event selection.
This number n follows a Poisson distribution with an expectation value of
μs + b, where s is the expected value from a signal model, and b is the expected
value from background processes. The likelihood function can be defined as

    L(\mu) = \frac{(\mu s + b)^n}{n!} e^{-(\mu s + b)}.

The parameter μ is a scale factor of the signal and is called the signal strength.
Given s and b, we can extract a μ value from a fit to data, which gives the value of n,
using the maximum likelihood technique explained in Sect. 4.4. The value of μ is expected
to be around unity if the data follow the assumed signal model, while it is close to zero
if the data follow the SM, that is, the data contain background only.
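For this single-bin likelihood the maximum can be found analytically: setting d ln L/dμ = 0 gives μs + b = n, i.e. μ̂ = (n − b)/s. The sketch below checks this against the log-likelihood itself; the values s = 20 and b = 10 are those used later in Fig. 4.9, while the observed counts are hypothetical.

```python
import math


def log_likelihood(mu, n, s, b):
    """ln L(mu) for a Poisson count n with expectation mu * s + b."""
    nu = mu * s + b
    return n * math.log(nu) - nu - math.lgamma(n + 1)


def fit_mu(n, s, b):
    """Maximum likelihood estimate of the signal strength: (n - b) / s."""
    return (n - b) / s


s, b = 20.0, 10.0
for n in (10, 20, 30):  # hypothetical observed counts
    mu_hat = fit_mu(n, s, b)
    print(f"n={n}  mu_hat={mu_hat:.2f}  lnL={log_likelihood(mu_hat, n, s, b):.4f}")
```

In this simple case μ̂ becomes negative when fewer events than the expected background are observed (n < b).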

We modify this likelihood function by adding more terms. Since s corresponds
to L σ_phys A ε of Eq. (4.1), we can consider systematic uncertainties from these
parameters (L, σ_phys, A, ε). For example, the uncertainty on the integrated luminosity, the
scale uncertainties (factorisation and renormalisation scales) on σ_phys, the
uncertainties from the jet energy scale, etc. on ε and so on can be systematic uncertainties
for μ. We often use Gaussian terms⁴ to constrain the signal term and also other
terms in b. Furthermore, in most data analyses, we estimate the background b in
the signal region, which is defined by our (signal) event selection, by using
so-called control or sideband regions in data with the help of MC samples. In this case,
the b in the signal region can be described with η_tf(α_tf) b, where b is the number
of events in the control region, and η_tf is a scale (transfer) factor from the control
region to the signal region. The η_tf is obtained from both data and MC samples so
that some additional constraints (α) are possible. In the end, one example of a
final likelihood function can be written as

    L(\mu, \boldsymbol{\theta}) = \mathrm{Pois}(n \,|\, \mu\,\eta_s(\alpha_s)\,s + \eta_{tf}(\alpha_{tf})\,b) \cdot N(\alpha_s \,|\, 0, 1)
        \cdot \mathrm{Pois}(m \,|\, \eta_b(\alpha_b)\,b) \cdot N(\alpha_b \,|\, 0, 1)
        \cdot N(\alpha_{tf} \,|\, 0, 1),

where θ = (b, α_s, α_b, α_tf), η_i(α_i) = μ_i + σ_i α_i, Pois(n|ν) = ν^n e^{−ν}/n!, and
N(x|μ, σ) = 1/(√(2π) σ) · exp(−(x − μ)²/(2σ²)). Here m is the number of observed
events in the control region. θ is called a set of nuisance parameters, which are
determined by the likelihood fit together with μ. η_i is a scale parameter for signal,
background, and so on. α_i is a parameter to adjust η_i through a Gaussian
constraint. The parameters μ_i and σ_i for η_i describe the centre values and their
uncertainties and are evaluated from other studies before the likelihood fit. If the α_i value
is 0, the value of η_i becomes μ_i. If not, the value of η_i is varied from its centre value.
Practically, μ_i is close to 1. Then, the effect on the signal strength μ from each constraint is
determined in the maximum likelihood (ML) fit. It means that the systematic
uncertainties on μ from each constraint term are simultaneously determined with the
μ value itself. We call this procedure a “profiled” fit. When the pre-studies on μ_i and
σ_i are proper, the values of α_i are expected to be close to 0 ± 1. For example, if the
error of α_i is smaller than 1 (say 0.3 or 0.4), it means that the value of σ_i given from
the pre-studies is tightly constrained by the data used in the ML fit, for example, data
of control regions. If this is not expected, some additional studies might be required
to understand such small values.

3 In some experimental results, 90% is also used instead of 95%. For 90%, the value of (1 − p) is
set to be 0.1.

4 A log-normal term, etc. can be used instead of a Gaussian term, for example, if we require a
positive definition.

4.7.3 Profile Likelihood Ratio 


We introduce the following likelihood ratio as a test statistic t_μ:

    t_\mu = -2 \ln \lambda(\mu),
    \lambda(\mu) = \frac{L(\mu, \hat{\hat{\boldsymbol{\theta}}}(\mu))}{L(\hat{\mu}, \hat{\boldsymbol{\theta}})},

where the denominator of λ(μ) is maximised for both μ and θ (an unconditional
ML fit) but the numerator is maximised for θ with respect to a specified μ value (a
conditional ML fit). Since the denominator corresponds to the best fit to data, the
value of λ(μ) is 0 < λ(μ) ≤ 1 so that the value of t_μ is 0 or positive. When the
numerator with a specified μ value follows the data, t_μ is small; if not,
t_μ becomes large. The p-value is defined as

    p_\mu = \int_{t_{\mu,obs}}^{\infty} f(t_\mu | \mu')\,dt_\mu,

where t_{μ,obs} is the value of the observed t_μ and f(t_μ|μ′) is the probability density
distribution of t_μ under the assumption of the signal strength μ′. The advantage of the
use of this test statistic is that the distribution of t_μ follows a χ² distribution of one
degree-of-freedom: f(t_μ|μ) ~ χ²_{dof=1}(t_μ), so that we can evaluate the p-value without
toy Monte Carlo. We explain the overall idea of the discovery and exclusion using
this test statistic below, but the technical detail of the hypothesis test using this test
statistic can be found in Ref. [1].

In high-energy experiments, we search for a new signal particle by checking for an
excess over the events expected under a background-only assumption. The existence
of a signal corresponds to μ > 0. For this case, an alternative test statistic
t̃_μ = −2 ln λ̃(μ) is introduced,

    \tilde{\lambda}(\mu) = \begin{cases}
        \dfrac{L(\mu, \hat{\hat{\boldsymbol{\theta}}}(\mu))}{L(\hat{\mu}, \hat{\boldsymbol{\theta}})} & (\hat{\mu} \ge 0) \\
        \dfrac{L(\mu, \hat{\hat{\boldsymbol{\theta}}}(\mu))}{L(0, \hat{\hat{\boldsymbol{\theta}}}(0))} & (\hat{\mu} < 0),
    \end{cases}    (4.46)

where the best-fit μ value with a deficit (μ̂ < 0) is replaced with μ = 0.


5 For f(t_μ|μ′), we can use a noncentral χ² distribution of one degree-of-freedom.

4.7.3.1 Discovery
We test μ = 0, that is, we try to reject the null hypothesis H_0 of μ = 0 (background-only).
We use a special notation q_0 = t̃_0 for this case. From Eq. (4.46), we use

    q_0 = \begin{cases}
        -2 \ln \lambda(0) = -2 \ln \dfrac{L(0, \hat{\hat{\boldsymbol{\theta}}}(0))}{L(\hat{\mu}, \hat{\boldsymbol{\theta}})} & (\hat{\mu} \ge 0) \\
        0 & (\hat{\mu} < 0).
    \end{cases}

We get a single value of q_0 from data, q_0^obs, and evaluate the p-value p_0 using

    p_0 = \int_{q_0^{obs}}^{\infty} f(q_0 | 0)\,dq_0,

where f(q_0|0) is the distribution of q_0 under the assumption of μ = 0.
Figure 4.9a shows distributions of q_0 for the assumptions of μ = 0 and 1: f(q_0|0)
and f(q_0|1). Once we obtain the distribution f(q_0|0), we can evaluate p_0 using q_0^obs.
Then, when the p_0 value is smaller than 2.87 × 10^-7, we can claim discovery.⁶
We can approximate the f(q_0|0) distribution as follows:

    f(q_0 | 0) = \frac{1}{2} \delta(q_0) + \frac{1}{2} \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{q_0}} e^{-q_0/2}.

By using this equation, a z-value Z can be obtained as

    Z = \sqrt{q_0^{obs}}.

The 5σ discovery corresponds to q_0^obs = 25. In Fig. 4.9a, the value of q_0^obs is 23,
which is just an example, so that we cannot claim a “discovery.” In case of MC
studies, as shown in Fig. 4.9a, we can use the median of the f(q_0|1) distribution
as q_0^obs. In this case, q_0^med[f(q_0|1)] is smaller than 25 so that we cannot claim the
“discovery”⁷ for a physics model of s = 20.
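For the simple two-region counting model quoted in the caption of Fig. 4.9, L(μ, b) = Pois(n | μs + b) · Pois(m | b), the test statistic q_0 has a closed form. The unconditional fit gives b̂ = m and μ̂s = n − m, and the conditional fit with μ = 0 gives b = (n + m)/2, so that q_0 = 2[n ln(2n/(n + m)) + m ln(2m/(n + m))] for n > m. The sketch below is an illustration under these assumptions and uses the expectation values (n = s + b, m = b) as a stand-in for the median outcome; it is not the procedure used to produce the figure.

```python
import math


def xlogy(x, y):
    """Return x * ln(y), with the convention 0 * ln(y) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def q0_on_off(n, m):
    """q0 for L = Pois(n | mu*s + b) * Pois(m | b), testing mu = 0.

    n: events observed in the signal region
    m: events observed in the control region
    """
    if n <= m:  # mu_hat <= 0: no excess, so q0 is set to 0
        return 0.0
    b_cond = 0.5 * (n + m)  # conditional estimate of b for mu = 0
    return 2.0 * (xlogy(n, n / b_cond) + xlogy(m, m / b_cond))


# Expectation values for s = 20 and b = 10 (and s = 36 for comparison).
for s in (20, 36):
    b = 10
    q0 = q0_on_off(s + b, b)
    print(f"s={s}  q0={q0:.2f}  Z=sqrt(q0)={math.sqrt(q0):.2f}")
```

With these inputs q_0 lies between 9 and 25 for s = 20 and just above 25 for s = 36.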


6 In some special cases, we may take into account the so-called look-elsewhere effect. In this case, the
standard p_0 value is called a local p_0 and the p_0 after the look-elsewhere effect is called a global
p_0. In case of the Higgs search in the LHC experiments (see Sect. 8.1.2), this effect was taken into
account because the Higgs mass was unknown and we searched for a Higgs signal in a certain mass
range which is much wider than the resolution (Higgs natural width ⊗ detector), where we expect
some statistical fluctuations even if we would have a true Higgs. The practical method is discussed
in Ref. [2].

7 The value of q_0^med[f(q_0|1)] is larger than 9 so that we may say the “evidence.” For the “discovery”,
we may need s ≈ 36 in this example.

4.7.3.2 Exclusion or Upper Limit

We test μ ≠ 0, that is, we try to reject the null hypothesis H_0 of the signal+background
model. When a specified μ value is equal to or smaller than the μ̂ of the unconditional
ML fit, we consider that q_μ is 0. It means that the exclusion of models is performed
only for μ values which are larger than the observed best-fit μ̂. We define q_μ as

    q_\mu = \begin{cases}
        -2 \ln \lambda(\mu) & (\hat{\mu} \le \mu) \\
        0 & (\hat{\mu} > \mu).
    \end{cases}

We evaluate the p-value p_μ using

    p_\mu = \int_{q_\mu^{obs}}^{\infty} f(q_\mu | \mu')\,dq_\mu,

where f(q_μ|μ′) is the distribution of q_μ under the assumption of μ′.
Figure 4.9b shows distributions of q_{μ=1} (simply q_1) for the assumptions of μ = 0
and 1. When the p_μ value is smaller than 0.05, we can claim a 95% CL_{s+b} exclusion.
This corresponds to q_μ^obs > 2.69 (= 1.64²). In Fig. 4.9b, the observed q_1 (23
as an example) allows us to claim the “95% CL_{s+b} exclusion”, where the data
cannot be explained by a model of s = 20. Practically, we need a scan of μ values to
find the μ value having p_μ = 0.05. This corresponds to μ ≈ 0.4 in case of Fig. 4.9b.
For a 95% CL_s exclusion⁸, we need a distribution f(q_μ|0) to evaluate
CL_b = \int_{q_\mu^{obs}}^{\infty} f(q_\mu | 0)\,dq_\mu. In case
of MC studies, as shown in Fig. 4.9b, we can use the median of the f(q_μ|0)
distribution as q_μ^obs, and then CL_b = 0.5. For a 95% CL_s exclusion in MC studies, we can use
q_μ^obs > 3.84 (= 1.96²).

Fig. 4.9 a f(q_0|0) with f(q_0|1) for discovery b f(q_1|1) with f(q_1|0) for exclusion (upper limit).
f(·|0) and f(·|1) show distributions for background-only and signal+background events,
respectively. A likelihood function of
L(μ, θ = b) = (μs + b)^n e^{−(μs+b)}/n! · b^m e^{−b}/m! is used, where the variables
n and m are the numbers of events observed in the signal and control regions and s = 20 and
b = 10 are used in this example; b in the signal region is estimated from the value of b in the
control region. Dashed curves are central (for blue) and noncentral (for red) χ² distributions of
one degree-of-freedom. For the noncentral cases, so-called Asimov data, which is defined as data
produced with the expectation values of inputs (s, b, and μ), is used to evaluate a width required in
an approximate formula of f(q|μ) [1]
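For the case where we consider models with μ ≥ 0, we can define and use an
alternative test statistic q̃_μ:

    \tilde{q}_\mu = \begin{cases}
        -2 \ln \dfrac{L(\mu, \hat{\hat{\boldsymbol{\theta}}}(\mu))}{L(0, \hat{\hat{\boldsymbol{\theta}}}(0))} & (\hat{\mu} < 0) \\
        -2 \ln \dfrac{L(\mu, \hat{\hat{\boldsymbol{\theta}}}(\mu))}{L(\hat{\mu}, \hat{\boldsymbol{\theta}})} & (0 \le \hat{\mu} \le \mu) \\
        0 & (\hat{\mu} > \mu).
    \end{cases}

A procedure similar to the case of q_μ can be applied [1].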


8 Note that the integration range is different to the case of Eq. (4.45). 
Detector Calibration


The information needed when analysing data, for example for cross-section
measurements, is the 4-momentum vectors of the particles of interest, and possibly the
knowledge of the species of the particles. For this purpose, a detector for high
energy physics is usually designed so that it allows us to measure the momentum
or energy, and information for particle identification, such as velocity. However,
the recorded data as they are do not tell us anything. They are just a bunch of digits
which are not energies or positions of particles unless they are properly translated
into meaningful physical variables.

This chapter describes the procedure to retrieve meaningful information that is 
needed in physics analyses from raw data. This process is called calibration, and is 
one of the most important processes in the whole flow of the high energy physics 
experiments. 


5.1 From Raw Data to Meaningful Information 


Let’s first imagine how data are generated and recorded. As an example, suppose we
take data with an electromagnetic calorimeter consisting of a sandwich structure of lead
and scintillator. A photon hitting the calorimeter generates a positron-electron pair by
photon conversion, mainly in the lead plates. The positron or electron ionises the
scintillator, resulting in scintillation light. This scintillation light is detected by
photo-sensors such as photomultiplier tubes (PMTs). The electrical signal from
the PMT is then converted by an analog-to-digital converter (ADC) and recorded as
a series of digits. Ideally, the light yield of the scintillator is linear in
the energy deposited by the electron or positron, and the PMT output is linear in
the scintillation light. Assuming this linearity of the light yield and the
PMT response, one can in principle measure the energy deposited by the electron and
positron, or ultimately the incident photon energy, from the ADC counts. However,
let’s recall that what we have here is just ADC counts, which are merely digits. They
represent the energy, but do not mean energy yet without the proper conversion.
This important procedure of converting the ADC counts to energy is called
“detector calibration”, or “calibration” for short.
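As a minimal sketch of what such a conversion looks like under the linearity assumption, the snippet below turns ADC counts into energy with a single conversion factor. The pedestal (the ADC reading with no signal) is an extra ingredient assumed here for illustration, and the function names and numbers are hypothetical.

```python
def gain_from_reference(adc_counts, pedestal, known_energy_gev):
    """Derive the conversion factor (GeV per count) from a signal of known energy."""
    return known_energy_gev / (adc_counts - pedestal)


def adc_to_energy(adc_counts, pedestal, gain_gev_per_count):
    """Convert raw ADC counts to energy in GeV, assuming a linear response."""
    return (adc_counts - pedestal) * gain_gev_per_count


# Hypothetical numbers: a 10 GeV reference signal reads 2100 counts on a pedestal of 100.
gain = gain_from_reference(adc_counts=2100, pedestal=100, known_energy_gev=10.0)
energy = adc_to_energy(1100, pedestal=100, gain_gev_per_count=gain)
print(f"gain = {gain:.4f} GeV/count, energy = {energy:.2f} GeV")
```

The point of the sketch is only that the digits acquire physical meaning once the conversion factor is known; how that factor is obtained in practice is the subject of the following sections.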

In the above example, we discussed the concept of energy calibration of the
calorimeter. This concept is common to all detectors, whatever they measure.
Sometimes, we measure the time interval between detector signals, in which case the data
are recorded by a time-to-digital converter (TDC). In this case, the calibration from
TDC counts to time is needed. Sometimes, we measure the location of a charged
particle hitting a position-sensitive sensor. In this case, the hit information has to
be interpreted as position information. This is also considered a calibration
in a sense. In the following sections, we discuss some concrete procedures of
detector calibration.


5.2 Detector Alignment 


The tracking device usually consists of many finely segmented channels. Assuming
we know the location of each channel, we can measure the position of charged
particles by sensing the signal from each channel. This means that accurate
and precise knowledge of the location of the tracking device, or more precisely the
position and orientation of each channel (hence six degrees of freedom), is crucial
to measure the hit position of the particle at the detector. The procedure to retrieve the
position of the tracking device or each channel is called “alignment”. Not only
tracking devices but also any other multi-channel detectors need to be aligned.

The alignment procedure can be divided into two steps. The first is the mechanical
measurement, or survey, of the detector components. In the survey, the location of
the large structures of the detector is measured, usually before starting data taking
or just after installing the detector. Since each large structure is assembled from
small components such as sensors with a precision specified in each experiment, the
position of each channel is considered to be known with the precision of the assembly
(and the survey) once the survey is performed.

The second step is the alignment using charged particles. The idea is to use
charged particles as a probe. Suppose we have a tracking device composed of five
layers of silicon strip detectors, and we would like to align the sensors of a particular
layer. In this case, a special tracking algorithm, which does not use the information
from the layer to be aligned, is prepared and the particle trajectory is
reconstructed. Then this track is extrapolated to the layer under alignment to get the
so-called residual, the difference in position between the extrapolated probe track
and the hit on the layer. Higher momentum tracks are preferred here, to
minimise the extrapolation uncertainty due to multiple scattering. Here, when
we say a “hit”, it is based on the hypothesis or the prior knowledge of the location
of the silicon strip sensor. If this prior knowledge is wrong, the residual is not
zero on average. In other words, the sensor position that gives zero residual is likely to be
the true position. Based on this general idea, the layer under alignment is aligned
by adjusting the location of the sensors so that the residual distribution peaks
close to zero and becomes narrow.
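The idea can be illustrated with a toy model. The sketch below assumes straight-line tracks in one dimension (no magnetic field), five equally spaced layers, and a single layer whose reported hit positions are displaced by a constant amount. All names and numbers are hypothetical. The track is fitted without the layer under test, extrapolated to it, and the mean residual is taken as the estimate of the displacement.

```python
import random
import statistics


def fit_line(zs, xs):
    """Least-squares straight line x = intercept + slope * z."""
    n = len(zs)
    mean_z = sum(zs) / n
    mean_x = sum(xs) / n
    szz = sum((z - mean_z) ** 2 for z in zs)
    szx = sum((z - mean_z) * (x - mean_x) for z, x in zip(zs, xs))
    slope = szx / szz
    return mean_x - slope * mean_z, slope


def unbiased_residuals(tracks, layer_z, test_layer):
    """Residuals on test_layer from tracks fitted without that layer."""
    zs = [z for i, z in enumerate(layer_z) if i != test_layer]
    residuals = []
    for hits in tracks:
        xs = [x for i, x in enumerate(hits) if i != test_layer]
        intercept, slope = fit_line(zs, xs)
        predicted = intercept + slope * layer_z[test_layer]
        residuals.append(hits[test_layer] - predicted)
    return residuals


# Toy data: layer 2 reports positions displaced by 0.05 (arbitrary length units).
random.seed(1)
layer_z = [0.0, 1.0, 2.0, 3.0, 4.0]
true_shift, hit_resolution, test_layer = 0.05, 0.01, 2
tracks = []
for _ in range(2000):
    x0, slope = random.uniform(-1, 1), random.uniform(-0.1, 0.1)
    hits = [x0 + slope * z + random.gauss(0, hit_resolution) for z in layer_z]
    hits[test_layer] += true_shift
    tracks.append(hits)

residuals = unbiased_residuals(tracks, layer_z, test_layer)
estimated_shift = statistics.mean(residuals)
aligned = [r - estimated_shift for r in residuals]
print(f"estimated shift: {estimated_shift:.4f}")
print(f"mean residual after correction: {statistics.mean(aligned):.1e}")
```

In this toy the correction is a single number per layer; a real sensor has six degrees of freedom, and the residual width as well as its mean is used to judge the result.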

In the actual application, a slightly different approach is taken although the basic
idea is the same as we have just explained. Because there are many layers and
millions of channels in the tracking detectors of modern collider experiments, it
is time-consuming and complicated to prepare a tracking algorithm that does not
use the information from specific layers or channels. Instead of having such a
special algorithm, it is more common to use the normal track reconstruction algorithm
with looser quality requirements to minimise the bias arising from the usage of hit
information from the layer or channel under alignment. For each probe track, the
residual is again measured, not only for a single layer or channel but for all the
layers or channels in the detector under alignment. Then
the sum of residuals from many layers is calculated for each track. The detector is
aligned so that the total residual is minimised. This is almost equivalent to a $\chi^2$
minimisation where the positions of the sensors are fitted.
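A toy version of this approach is sketched below, again assuming one-dimensional straight-line tracks. Every track is fitted with all layers, residuals are computed on all layers, and the layer offsets are updated so that the total squared residual decreases. Alternating the track fits and the offset updates is one simple way to carry out the minimisation; it is an illustrative choice, not the procedure of any particular experiment. Holding the first and last layers fixed as a reference is also an assumption made here, because a common shift or tilt of all layers cannot be determined from the residuals alone.

```python
import random


def fit_line(zs, xs):
    """Least-squares straight line x = intercept + slope * z."""
    n = len(zs)
    mean_z = sum(zs) / n
    mean_x = sum(xs) / n
    szz = sum((z - mean_z) ** 2 for z in zs)
    szx = sum((z - mean_z) * (x - mean_x) for z, x in zip(zs, xs))
    slope = szx / szz
    return mean_x - slope * mean_z, slope


def align_layers(tracks, layer_z, free_layers, n_iterations=30):
    """Alternate track fits and offset updates to reduce the total squared residual."""
    offsets = [0.0] * len(layer_z)
    for _ in range(n_iterations):
        residual_sums = [0.0] * len(layer_z)
        for hits in tracks:
            corrected = [x - o for x, o in zip(hits, offsets)]
            intercept, slope = fit_line(layer_z, corrected)
            for i, z in enumerate(layer_z):
                residual_sums[i] += corrected[i] - (intercept + slope * z)
        for i in free_layers:
            # The mean residual is the offset update that minimises the
            # squared residuals of this layer for the current track fits.
            offsets[i] += residual_sums[i] / len(tracks)
    return offsets


# Toy data: three inner layers are displaced; the outer two serve as the reference.
random.seed(2)
layer_z = [0.0, 1.0, 2.0, 3.0, 4.0]
true_offsets = [0.0, 0.03, -0.05, 0.02, 0.0]
tracks = []
for _ in range(2000):
    x0, slope = random.uniform(-1, 1), random.uniform(-0.1, 0.1)
    tracks.append([x0 + slope * z + random.gauss(0, 0.01) + off
                   for z, off in zip(layer_z, true_offsets)])

fitted_offsets = align_layers(tracks, layer_z, free_layers=[1, 2, 3])
print([round(o, 4) for o in fitted_offsets])
```

Because the misaligned layers take part in the track fit, each iteration recovers only part of the displacement, which is why several iterations are needed in this sketch.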

So far in this section, the basic concept of the alignment was given, where we
discussed the alignment of a single detector. But a collider detector, for example,
is a more complex and larger object consisting of several types of detectors. Further,
in the actual application, the alignment is performed in several steps. Again using
the silicon strip tracker in the ATLAS experiment as an example, the first level is to
align the whole tracker relative to the other detector systems. This means that neither
each layer nor each sensor is aligned individually. Instead, the whole support structure
holding the sensors or modules is aligned as a single object. Then, as the second-level
alignment, each layer is aligned, i.e. each layer can be moved independently. Finally,
as the third level, the individual modules or sensors within each layer are aligned. In
this way, a failure of the $\chi^2$ fit due to a possible large deviation of the initial
value from the actual position can be avoided. In addition, the step-by-step approach
saves computing time in the $\chi^2$ fit.

Figure 5.1 shows the residual distributions for the ATLAS silicon pixel detector,
where the first alignment was carried out by using cosmic rays and then proton-proton
collision data for more statistics. You can see that the width becomes narrower by
using the collision data, indicating the improvement of the alignment. Note that an
old result, which was obtained at the very beginning of the experiment, is intentionally
presented here for illustration purposes. Currently, the width of the residual
distribution is close to that of the simulation result, where all the detector positions
are perfectly known.


5.3 Momentum Scale Calibration of Magnetic Spectrometer 


We first explain the concept of measurements of charged particle momentum. Then
the calibration of the momentum scale is discussed.

Fig. 5.1 The residual distribution for the silicon pixel detector of the ATLAS experiment. Reprinted
under the Terms of Use from [1] ATLAS Experiment © 2022 CERN. All rights reserved. Red (black)
shows the residual using proton-proton collision (cosmic ray) data. Blue shows the prediction by
simulation. Note that this plot is intentionally selected to illustrate the effect of the
alignment and does not show the current precision


5.3.1 Momentum Measurement and its Resolution


Suppose a charged particle travels in a magnetic field $B$ (in tesla) along an arc of
radius $R$ (in meters). Also suppose we measure the charged particle positions by
position-sensitive detectors D1, D2, and D3, as shown in Fig. 5.2. The momentum
of the charged particle $p_T$ (in GeV) can be written as $p_T = 0.3 \times B\,(\mathrm{tesla}) \times R\,(\mathrm{meter})$.
Because the angle $\alpha$ in Fig. 5.2 is geometrically represented as $\alpha \approx L/R$, the depth of
the arc, called the sagitta ($s$ in meters), of the particle trajectory can be expressed as

$$
s = R\left(1 - \cos\frac{\alpha}{2}\right) \approx R\,\frac{\alpha^2}{8} = \frac{0.3\,B L^2}{8\,p_T}, \tag{5.1}
$$

where $L$ is the chord of the arc in meters. In case the track position at the three detectors
is measured as $x_1 \pm \sigma_x$, $x_2 \pm \sigma_x$, and $x_3 \pm \sigma_x$ (with a common uncertainty of $\sigma_x$),
the sagitta is $s = x_2 - \frac{x_1 + x_3}{2}$. The uncertainty of the sagitta is $\sigma_s = \sqrt{3/2}\,\sigma_x$. Therefore,
the momentum resolution can be represented as

$$
\frac{\sigma_{p_T}}{p_T} = \frac{\sigma_s}{s} = \sqrt{\frac{3}{2}}\,\sigma_x\,\frac{8\,p_T}{0.3\,B L^2}. \tag{5.2}
$$


In the same manner, in case the track is measured at $N$ points ($N$ is more than about 10),
the momentum resolution is represented as

$$
\frac{\sigma_{p_T}}{p_T} = \frac{\sigma_x \cdot p_T}{0.3\,B L^2}\sqrt{\frac{720}{N+4}}. \tag{5.3}
$$


From these calculations, you can see that the momentum resolution is proportional to
the momentum of the charged particle and the uncertainty of the position measurement
($\sigma_{p_T}/p_T \propto \sigma_x \cdot p_T$), and to the inverse of the magnetic field and of the square of the
length of the detectors. If you want to have better momentum resolution, more detectors
should be placed in a wider space where a stronger magnetic field is provided. This can be
imagined if you draw an arc through 3 or more points in a limited space and estimate
the curvature of the arc. Which can you estimate more precisely, an arc with a
smaller radius or an arc with a larger radius?¹
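To get a feeling for the numbers, the following sketch evaluates the sagitta and the momentum resolution. It assumes the three-point resolution follows from $\sigma_s = \sqrt{3/2}\,\sigma_x$ and $\sigma_{p_T}/p_T = \sigma_s/s$, and uses the $N$-point expression of Eq. (5.3). The field of 2 T, chord length of 1 m and position resolution of 100 µm are hypothetical values chosen only for illustration.

```python
import math


def sagitta_m(pt_gev, b_tesla, chord_m):
    """Sagitta s = 0.3 B L^2 / (8 pT), valid when s is much smaller than the radius."""
    return 0.3 * b_tesla * chord_m ** 2 / (8.0 * pt_gev)


def resolution_three_points(pt_gev, b_tesla, chord_m, sigma_x_m):
    """Relative pT resolution from three equally precise position measurements."""
    sigma_s = math.sqrt(1.5) * sigma_x_m
    return sigma_s / sagitta_m(pt_gev, b_tesla, chord_m)


def resolution_n_points(pt_gev, b_tesla, chord_m, sigma_x_m, n_points):
    """Relative pT resolution for N measurement points, as in Eq. (5.3)."""
    base = sigma_x_m * pt_gev / (0.3 * b_tesla * chord_m ** 2)
    return base * math.sqrt(720.0 / (n_points + 4))


B_TESLA, CHORD_M, SIGMA_X_M = 2.0, 1.0, 100e-6  # hypothetical detector
for pt in (1.0, 10.0, 100.0):
    s_mm = 1e3 * sagitta_m(pt, B_TESLA, CHORD_M)
    r3 = resolution_three_points(pt, B_TESLA, CHORD_M, SIGMA_X_M)
    r10 = resolution_n_points(pt, B_TESLA, CHORD_M, SIGMA_X_M, 10)
    print(f"pT = {pt:6.1f} GeV: sagitta = {s_mm:7.3f} mm, "
          f"3 points: {100 * r3:.2f}%, 10 points: {100 * r10:.2f}%")
```

The loop makes the linear growth of the relative resolution with momentum visible: as the sagitta shrinks towards the size of the position uncertainty, the curvature can no longer be measured well.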


5.3.2 Momentum Scale Calibration 


Going back to the topic of calibration, a measurement of the trajectory of charged
particles, or more specifically the sagitta, geometrical information of the tracking
detector and knowledge of the magnetic field strength are necessary to derive the
momentum, just as we have seen. Therefore, in a sense we do not really need a momentum
scale calibration for the magnetic spectrometer, i.e. there is no conversion
from one kind of information to another, such as the charge-to-energy conversion in the
case of the energy measurement by a calorimeter.

But in most experiments, an in situ calibration or correction of the momentum
scale is performed for better accuracy and precision. A common technique is to make
use of the known mass of some particles, for example, $K_S$, $J/\psi$ or $Z$. The momentum
scale of the reconstructed tracks is calibrated or corrected so that the peak position
of the invariant mass distribution reconstructed from two tracks becomes the world
average value² of $K_S$, $J/\psi$ or $Z$. Figure 5.3 shows the invariant mass reconstructed
from two oppositely charged muons. As you can see, with this calibrated data, the
peak position is consistent with the world average value of the $Z$ mass.


1 The answer is “with a smaller radius” (under the same B and L).

2 A physics quantity is measured by several different experiments. Such results are combined, that
is, “averaged”, for example, by the Particle Data Group (PDG) [2]. Such combined results are called
world average values.


The particle used as the calibration target depends on the experiments because of 
the limitation of the available particles. The data sample with high purity is always 
preferable to avoid uncertainty due to the background. At the same time, the large 
data set is also preferable to reduce the statistical uncertainty. The experimentalist 
has to consider the optimal use of the various calibration samples. 
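The mass-based correction can be sketched as follows. The example assumes a single multiplicative scale factor common to all tracks, takes roughly 91.19 GeV as the reference $Z$ mass, and uses the median of the reconstructed masses as a stand-in for the fitted peak position. The toy events (muon pairs from a $Z$ at rest with momenta reconstructed 1% too low) and all names are hypothetical.

```python
import math
import statistics

MUON_MASS_GEV = 0.10566
Z_MASS_REF_GEV = 91.19  # approximate world-average Z mass


def energy(p3, mass):
    """Energy of a particle with 3-momentum p3 and the given mass."""
    return math.sqrt(sum(c * c for c in p3) + mass * mass)


def invariant_mass(p3_a, p3_b, mass=MUON_MASS_GEV):
    """Invariant mass of two particles of equal mass from their 3-momenta."""
    e_tot = energy(p3_a, mass) + energy(p3_b, mass)
    p_tot = [a + b for a, b in zip(p3_a, p3_b)]
    m2 = e_tot ** 2 - sum(c * c for c in p_tot)
    return math.sqrt(max(m2, 0.0))


def scale(p3, factor):
    """Apply a common momentum scale factor to a 3-momentum."""
    return tuple(factor * c for c in p3)


# Toy sample: back-to-back muon pairs, reconstructed with momenta 1% too low.
p_true = math.sqrt((Z_MASS_REF_GEV / 2) ** 2 - MUON_MASS_GEV ** 2)
directions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.6, 0.0, 0.8), (0.0, 0.6, -0.8)]
measured_pairs = []
for d in directions:
    mu_plus = tuple(0.99 * p_true * c for c in d)
    mu_minus = tuple(-c for c in mu_plus)
    measured_pairs.append((mu_plus, mu_minus))

peak_before = statistics.median(invariant_mass(a, b) for a, b in measured_pairs)
correction = Z_MASS_REF_GEV / peak_before
peak_after = statistics.median(
    invariant_mass(scale(a, correction), scale(b, correction)) for a, b in measured_pairs
)
print(f"peak before: {peak_before:.3f} GeV, correction: {correction:.4f}, "
      f"peak after: {peak_after:.3f} GeV")
```

The single factor works here because the invariant mass scales almost linearly with a common momentum scale when the muon mass is negligible. In real data the peak has a finite width and sits on a background, so the peak position comes from a fit rather than a median.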

This section was devoted to describing the momentum scale calibration or correction.
But Fig. 5.3 shows another important point which we would like to mention.
It shows that the resolution depends on the alignment. As can be seen in Eqs. (5.2)
and (5.3), the momentum resolution has a linear dependence on the precision of the position
measurement for a track. Therefore, better alignment leads to better resolution. The
figure reflects the better alignment that was used when the data were reprocessed.


5.4 Energy Calibration of Calorimeter 


The energy calibration procedure for the calorimeter is classified into two steps. The 
first step is to calibrate each cell or channel, and the second is to calibrate the energy 
of the particle incident to the calorimeter, equivalent to the energy of the shower 
after clustering. These approaches are slightly different for the electromagnetic and 
hadronic calorimeters. Below, we discuss the concept of these two-step calibration 
procedures for the calorimeters. 


5.4.1 Cell-by-Cell Calibration 


In most cases, the energy information of the calorimeter is recorded as a digital
number that is converted by an ADC from the detector output, typically the pulse
height or charge created by the sensor. The goal of the cell-by-cell calibration is to find
the relation between the energy deposit and the ADC count for each channel, which is
a conversion factor. The set of factors for all the cells is called the calibration constants.

To get this calibration constant, the most powerful and very common technique
is to use muons as the calibration source, because a muon in a high energy physics
experiment behaves almost as a minimum ionising particle (MIP) that deposits a
constant energy per unit path length. The tracking system allows us to measure the path
length across the cell of the calorimeter, and hence to predict the energy deposit.
In this way, one can obtain the ADC counts per unit energy. Only muons can
serve as this kind of calibration source, because other charged particles develop either
electromagnetic or hadronic showers in materials, and their energy deposits are not
constant. On the other hand, a muon deposits its energy just by ionisation loss,
resulting in a rather constant energy deposit per unit path length. In energy frontier
collider experiments, muons from $Z$ boson decays are one of the cleanest samples.
They are isolated, i.e. there are no other particles nearby, and have high momentum.
The higher the momentum, the smaller the multiple scattering angles. This means
that the error in estimating the path length is smaller. In addition, $J/\psi \to \mu^+\mu^-$
events are also used as a lower momentum calibration source.
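A minimal sketch of the muon-based procedure for one cell is given below. It assumes a flat cell of known thickness, a constant energy loss per unit path length, and a track angle provided by the tracking system. The value of 2 MeV/cm, the cell thickness and the muon list are hypothetical. The median of the per-muon ratios is used as a simple robust estimator; this is a choice made for the sketch, not a prescription.

```python
import math
import statistics


def path_length_cm(thickness_cm, theta_rad):
    """Path length through a flat cell for a track at angle theta to the cell normal."""
    return thickness_cm / math.cos(theta_rad)


def expected_deposit_mev(dedx_mev_per_cm, path_cm):
    """Expected MIP energy deposit for a given path length."""
    return dedx_mev_per_cm * path_cm


def calibration_constant(muons, thickness_cm, dedx_mev_per_cm):
    """ADC counts per MeV for one cell, from (adc_counts, theta_rad) pairs."""
    ratios = []
    for adc_counts, theta_rad in muons:
        path = path_length_cm(thickness_cm, theta_rad)
        ratios.append(adc_counts / expected_deposit_mev(dedx_mev_per_cm, path))
    return statistics.median(ratios)


# Hypothetical muons crossing a 1 cm thick cell at different angles.
muons = [(100.0, 0.0), (202.0, math.pi / 3), (141.0, math.pi / 4)]
counts_per_mev = calibration_constant(muons, thickness_cm=1.0, dedx_mev_per_cm=2.0)
energy_mev = 150.0 / counts_per_mev  # convert a new reading of 150 counts
print(f"{counts_per_mev:.1f} counts/MeV, 150 counts -> {energy_mev:.2f} MeV")
```

The path-length correction is the step that relies on the tracking system: without it, inclined muons would appear to deposit more energy per cell than perpendicular ones.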

The additional advantage of the usage of muons as the calibration source is the fact 
that high momentum muons, which are regarded as MIPs, are available in cosmic 
rays. We can have this ideal calibration source for free everywhere in the world, 
except the underground experimental facilities such as Super-Kamiokande, where 
the rate of muons is very small compared to the collider data. 

In some cases, however, the in situ muon calibration may not be possible. In that 
case, the calibration results before installing the detector or assembling it into a big 
piece are used. For example, beam tests are employed, where the beam energy is 
precisely known. Or radioactive sources are also used because the energy spectrum 
of the emitted particles is well known. 

In addition to the calibration with particles, a common approach is to prepare
and use an artificially generated calibration source. For a detector whose output
is light, such as a scintillator, light flashers like lasers can be used to emulate
the signal. For a detector whose output is an electrical signal, such as a liquid argon
calorimeter, electrical test pulses to the readout electronics are often used. By using
this kind of calibration source, the relation between the detector output and the
ADC count can be identified, although the relation between the detector output and 
the energy deposit is not. Still it is useful, much better than nothing, because, for 
example, relative gains within a detector can be monitored. This is of particular 
importance for the large-scale detectors where it is not an easy task to adjust the 
detector response for each individual channel. For this reason, most of the detector 
systems nowadays are equipped with such a calibration device that also works as the 
monitoring system of the detector performance. 


5.4.2 Energy Cluster Calibration of Electromagnetic Shower


In principle, once the calibration constant for each cell or channel is obtained, one
should be able to know the energy of the photon or electron incident on the calorimeter
just by summing the energy of each channel associated with the energy cluster
generated by the photon or electron. In practice, however, simple summing is not good
enough for many reasons. For example, the energy deposited by an electromagnetic
shower is much larger than that of muons, which makes the extrapolation
to higher energy difficult. There are also dead materials among the active sensors
constituting a calorimeter, and the energy lost in the dead materials needs to be corrected.
A different clustering algorithm may lead to a different energy sum even for the
same event. Therefore, it is necessary to calibrate the detector in situ with either
electrons or photons whose energy is known without the calorimeter information.
In this regard, the electron is a more user-friendly calibration source because
detectors other than the calorimeter under calibration can detect the electron and
measure its momentum. On the other hand, only electromagnetic calorimeters
can detect and measure the energy of photons precisely. This illustrates the difficulty
of in situ photon calibrations in collider experiments.

The most common and powerful technique using electrons exploits the fact that
the electron’s energy deposit ($= E$) in a calorimeter should be equal to its momentum
($= p$) measured by a tracker, because an electron deposits all its kinetic energy in the
calorimeter, and we can safely ignore the electron mass in the momentum region of
interest. Besides, for most of the momentum range of interest, magnetic
spectrometers consisting of charged particle tracking devices and magnets have better
momentum resolution than the energy resolution of calorimeters. Combining these two facts,
the momentum measured by the magnetic spectrometer can be a good reference for
the electromagnetic scale. Commonly used is the $E/p$ distribution, where electrons
or positrons make a peak at unity if the detector is properly calibrated.
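As an illustration of how the $E/p$ distribution can be turned into a correction, the sketch below derives a single scale factor that moves the centre of the distribution to unity. The electron list is hypothetical, and the median is used as a simple stand-in for the peak position, which in practice would come from a fit that accounts for the tails of the distribution.

```python
import statistics


def e_over_p_correction(electrons):
    """Scale factor for the calorimeter energy from (energy, momentum) pairs."""
    ratios = [energy / momentum for energy, momentum in electrons]
    return 1.0 / statistics.median(ratios)


# Hypothetical electrons (calorimeter energy, tracker momentum) in GeV,
# with a calorimeter response that is about 5% too low.
electrons = [(19.1, 20.0), (28.4, 30.0), (38.0, 40.0), (47.6, 50.0), (57.3, 60.0)]

factor = e_over_p_correction(electrons)
corrected = [(factor * energy, momentum) for energy, momentum in electrons]
centre_after = statistics.median(energy / momentum for energy, momentum in corrected)
print(f"correction factor: {factor:.4f}, E/p centre after correction: {centre_after:.4f}")
```

A single global factor is the simplest possible choice. The same ratio can be studied as a function of energy or position in the calorimeter if the response is not uniform.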

Another calibration method, which does not rely on other detectors such as the
magnetic spectrometers, makes use of the decay of particles whose masses are precisely
known. The decays $Z \to e^+e^-$ and $J/\psi \to e^+e^-$ are commonly used,
where the calorimeter’s response to the positron or electron is calibrated so that the
reconstructed invariant mass gets closer to the world average value. The width of
the invariant mass distribution should be narrower after a successful calibration.
In the calibration using particle decays, we should be aware that the energy of the
particles in the calibration source should preferably be close to the range of interest in
the physics analysis to avoid a large extrapolation; in the above examples, the typical
electron energy is of the order of 10 GeV in $Z \to e^+e^-$, while only a few GeV or
less in $J/\psi \to e^+e^-$. It means that the former should be used for relatively high $p_T$
electrons and the latter for low $p_T$. Finally, a decay chain with a high signal-to-noise
ratio needs to be selected to avoid a possible bias due to the background.

The calibration method using mass, for example with $\pi^0 \to \gamma\gamma$, can also be used for
the photon calibration in principle. However, it is difficult to find a good decay chain
which has enough statistics and covers a wide range of photon momenta. A lack
of good calibration sources for photons is a common issue in many experiments.
The widely used approach is to rely on the electron calibration because the detector
responses to electrons and photons are similar to first order. They both develop an
electromagnetic shower, where the only difference is the depth at which the shower
starts. For precision, Monte Carlo simulation is often used to correct small
differences in the detector responses between electrons and photons. Further, when
experiments become more mature or have more statistics, rare processes can be the
calibration source. An example is $Z \to \mu^+\mu^-\gamma$, where the photon is radiated off a muon.
The statistics is much smaller than that of $Z \to e^+e^-$. The momentum range of this
photon is limited because the photon is radiated from a lepton from the $Z$ boson
decay. Still, $Z \to \mu^+\mu^-\gamma$ is used for calibration; to be precise, it is a validation tool
to check the calorimeter response to photons.


5.4.3 Energy Cluster Calibration of Hadronic Shower 


The energy-scale calibration for hadronic showers is much more complicated than 
that for electromagnetic showers for the following reasons. 

First, the detector response to hadrons is different from that to electrons or photons,
because of the different shower evolutions. Usually, all the kinetic energy
of particles incident on a calorimeter is deposited in the case of electromagnetic showers,
meaning all the energy can be seen by the detector. On the other hand, some
part of the incident energy of hadrons is often lost because the nuclear interaction
length (see Sect. 3.3.2) is relatively long compared to the size or depth of the
detector. Therefore, even if an electron and a hadron have the same energy and hit
the same calorimeter, the “visible” energy may be different. Sometimes, this visible
energy ratio of the electron and hadron with the same energy is referred to as the
“e/h” ratio. With a few exceptions, most hadron calorimeters have e/h > 1,
demanding a special correction in the calibration process.

Second, the fluctuation of energy deposits by hadrons is very large, while it is
almost zero for electromagnetic showers if the calorimeter is deep
enough. The fluctuation comes from the fact that the hadronic shower sometimes
creates a $\pi^0$ that immediately decays into $\gamma\gamma$ and loses its energy through
electromagnetic showers. Hence, when there is a $\pi^0$ in the hadronic shower, the
visible energy gets larger, and vice versa. Another reason for the large fluctuation
is neutron production in the development of the hadronic shower. In the case of
charged hadron production, such as a proton, the kinetic energy can be detected as
the energy deposit by ionisation, while a low-energy neutron ($\ll 1$ GeV) does not have
such an energy loss mechanism and passes through a calorimeter almost unnoticed. Therefore,
the visible energy is influenced by the number of produced neutrons. These are the
main reasons why there is a rather large fluctuation in the energy loss of the hadronic
shower. In addition, hadrons such as pions or kaons decay (semi-)leptonically, yielding
neutrinos that are not detected by the calorimeter. The existence of neutrinos in the
hadronic shower changes the total energy deposit in the calorimeter.

For the above reasons, in order to correctly deduce the energy of hadrons incident
on the calorimeter, special care needs to be taken after the cell-by-cell calibration. In
collider experiments, it is rare to have a single hadron incident on the calorimeter.
Instead, a jet (see Sect. 6.4) is the object handled by the calorimeter, which is an object
defined by a human being. This means that the energy of a jet depends on the definition,
or actually on the clustering algorithm. For this reason, the treatment of jet energy
calibration is described after introducing the jet reconstruction (see Sect. 6.4.4).

Particle Identification


In this chapter, we discuss the method of particle identification, which is also called 
object identification because what we reconstruct or identify is usually not a particle 
but an object such as a charged particle trajectory, a jet that is a cluster of many 
particles, missing transverse energy and so on. We go through common objects that 
are widely used in high energy physics. 


6.1 Tracking and Vertexing 


Tracking refers to the reconstruction of the trajectory of a charged particle, or a track.
Once we find such a trajectory in a magnetic field, we can measure the momentum
of this particle through the curvature of the trajectory. The energy of a particle can
be estimated with an assumption of the particle type, or with measurements related
to the particle identification. Thus, tracking allows us to obtain a 4-momentum
vector, which is ultimately needed in data analysis. For this reason, tracking
capability is one of the most important features of most detectors in
high energy physics.

Tracking can be divided into three parts: the measurement of the space hit
points of the particles in the detector, the pattern recognition applied to the hit points
to make a candidate track (referred to as track finding), and the fit of the candidate track
to get a smooth track, which is our best guess for the true particle trajectory. We
discuss these three steps in the following subsections.

The collection of tracks in an event further allows us to find or guess, for example,
the particle-particle collision points of the event, or decay positions of short-lived
particles, such as $K_S$, $\Lambda$, b-hadrons, $\tau$, etc. In either case, more than one particle
appears from a common location, which we call a vertex. Vertexing refers to the
reconstruction of the vertex using the collection of tracks.


6.1.1 Space Hit Point


Tracking starts with searching for space hit points of charged particles in the
detector. A charged particle deposits kinetic energy by ionisation
when passing through material. This results in an electrical signal, or in scintillation
light which is converted to an electrical signal in the end, in the sensor
material. The sensor for tracking is usually segmented, allowing us to determine the hit
position of the charged particle.

If the tracking device were pixelated and had 100% efficiency without any fake hits,
i.e. virtually a perfect detector, and only one charged particle existed in an event,
the space hit point could be uniquely determined without any ambiguity, and there
would be nothing further to discuss about finding the space hit point. However, the real
world is not so kind to us. The particle could pass through more than one segment due to
the incident angle, resulting in multiple hits in a sensor. Moreover, many particles are
often generated by particle-particle collisions or a beam hitting a target, and these
particles may overlap and create multiple hits in a single sensor. There could also be
false measurements due to the inefficiency or the noise of the detector. Therefore, in order
to determine or estimate the space hit points, we first have to apply clustering
techniques to the set of raw hits in the sensor. Once we have a cluster of hits,
the next step is to estimate the hit position of a particle from the cluster. Below we
discuss the clustering techniques first, and then how to determine the hit position
from the cluster.

Many tracking devices used so far are not pixelated. Instead, in many cases, they
provide one-dimensional information from the wires in a chamber or strip electrodes
in a silicon sensor. Therefore, we need to convert such one-dimensional measurements
into space hit points, which we will discuss in the last of
the following subsections.


6.1.1.1 Hit Clustering 

Below we consider a position-sensitive device in one dimension such as a silicon
strip sensor, a wire chamber or a fibre tracker. A basic approach to the clustering is
to group the consecutive hits. This is rather straightforward if the detector is perfect.
All you need is to set a (very low) threshold for each channel to define a single
hit. In actual experiments, however, there could be some dead channels, or noisy
channels which always have a hit regardless of the presence
of a particle. Special treatments are needed for these kinds of defects.

For example, in order to avoid a fake track caused by noise hits, we mask noisy
channels, i.e. ignore such channels when making a cluster. Or if two silicon strip
sensors can be paired because they are located close together, requiring two hits on the pair
can reduce the fake hits significantly. In contrast, a known dead channel is sometimes
treated as if it had a hit when the two adjacent channels have hits.

In addition to defects such as noisy or dead channels, another type of care is
needed when a large cluster is generated. A large cluster is possible when a
particle hits a sensor obliquely. In this case, the energy deposit and hence the signal size in
each channel is small. So the threshold to define a hit needs to be set low to reconstruct
this type of track. But of course such lower thresholds can cause higher noise rates.
Therefore, an optimisation of the thresholds is always key to achieving higher efficiency
with fewer fake hits.
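The treatments described in this subsection can be combined in a few lines. The sketch below assumes hits are given as a list of channel numbers that passed the threshold, together with lists of known noisy and dead channels. The function name and the example channel numbers are hypothetical.

```python
def cluster_hits(hit_channels, noisy=(), dead=()):
    """Group consecutive hit channels of a one-dimensional sensor into clusters."""
    # Mask noisy channels: they are ignored when making clusters.
    hits = set(hit_channels) - set(noisy)
    # Treat a known dead channel as hit when both of its neighbours have hits.
    bridged = {ch for ch in dead if ch - 1 in hits and ch + 1 in hits}
    hits |= bridged

    clusters = []
    for channel in sorted(hits):
        if clusters and channel == clusters[-1][-1] + 1:
            clusters[-1].append(channel)  # extends the current cluster
        else:
            clusters.append([channel])    # starts a new cluster
    return clusters


# Channel 20 is known to be noisy and channel 5 is known to be dead.
clusters = cluster_hits([3, 4, 6, 10, 11, 12, 20], noisy=[20], dead=[5])
print(clusters)
```

In this example the dead channel 5 joins its two neighbours into one cluster, and the noisy channel 20 does not produce a cluster of its own.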


6.1.1.2 Hit Point Determination 

The method to estimate the hit position depends on how the signal information in
each channel is stored. In the so-called “binary readout”, where only the location
of channels with a signal above a threshold is recorded, all we can do is
take the average of the positions of the hit channels. For example, if only one hit is
found in a silicon strip sensor, the location of the strip with the hit is regarded as the
incident position of the particle. If there are two hits, the particle incident position is
regarded as the middle of the two hit strips, and so on. The position resolution of such a binary
readout scheme becomes $d/\sqrt{12}$, where $d$ is the size of the segment of each channel,
as discussed in Sect. 4.2.5.
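The binary-readout estimate and its resolution can be checked with a small toy simulation. The sketch assumes that strip $i$ is centred at $i \times d$ and that, for a single-strip cluster, the true position is uniformly distributed across the strip. The pitch of 80 µm is a hypothetical value.

```python
import math
import random
import statistics


def binary_position(hit_strips, pitch):
    """Average of the centres of the hit strips; strip i is centred at i * pitch."""
    return pitch * sum(hit_strips) / len(hit_strips)


pitch_um = 80.0
one_hit = binary_position([12], pitch_um)       # centre of strip 12
two_hits = binary_position([12, 13], pitch_um)  # midway between strips 12 and 13

# Toy check of the resolution: the residual (true position minus strip centre)
# is uniform within one strip, so its spread should approach pitch / sqrt(12).
random.seed(7)
residuals = [random.uniform(-pitch_um / 2, pitch_um / 2) for _ in range(20000)]
simulated_rms = statistics.pstdev(residuals)
expected_rms = pitch_um / math.sqrt(12)
print(f"one hit: {one_hit} um, two hits: {two_hits} um")
print(f"simulated RMS: {simulated_rms:.2f} um, d/sqrt(12): {expected_rms:.2f} um")
```

The factor $1/\sqrt{12}$ is simply the standard deviation of a uniform distribution of unit width, which is why it appears whenever only the identity of the hit segment is known.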

In case the size of the collected charge at each channel is recorded by some
means, the centre of gravity in terms of the collected charge can be considered to
be the particle hit position. An even more sophisticated approach, such as making use
of a lookup table, is sometimes used. In either way, the position resolution can be
improved compared to the binary readout scheme. Let’s assume that we now measure
the position along the x-axis with a silicon strip detector whose strip pitch is $d$, where
we have two hits. Also, let $Q_L$ and $Q_R$ be the charges collected at the left and right strips.
With the centre of gravity method, $x$, which is the particle incident position defined
as the distance from the right side strip, can be measured to be

$$
x = \frac{Q_L}{Q_L + Q_R}\,d = \frac{Q_L}{S}\,d,
$$

where $S$ is the total charge accumulated by the two strips. Assuming the accumulated
signal charge, $S$, is much larger than the noise, the uncertainty of measuring $x$ can
be written as

$$
\sigma_x \approx \frac{N}{S}\,d,
$$

where $N$ is the noise. Since the binary readout gives us a position resolution of
$d/\sqrt{12}$, a device with a signal-to-noise ratio ($S/N$) greater than $\sqrt{12} \approx 3.5$ has
the advantage of using the analog information of the charge ($Q$). In many detectors
including silicon strip sensors, it is very common to have $S/N$ greater than 10
or 20. Hence, adding the analog information usually improves the position resolution.
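The comparison between the two readout schemes can be made concrete with a few numbers. The sketch below takes the centre-of-gravity position as $x = d\,Q_L/(Q_L + Q_R)$, measured from the right-hand strip, and its uncertainty as approximately $d\,N/S$, which is the form implied by the $\sqrt{12}$ threshold quoted in the text. The pitch and the charges are hypothetical values.

```python
import math


def cog_position(q_left, q_right, pitch):
    """Distance from the right-hand strip, weighted by the collected charges."""
    return pitch * q_left / (q_left + q_right)


def cog_resolution(pitch, signal_to_noise):
    """Approximate centre-of-gravity resolution, valid for signal >> noise."""
    return pitch / signal_to_noise


def binary_resolution(pitch):
    """Resolution of the binary readout scheme."""
    return pitch / math.sqrt(12)


pitch_um = 80.0
x_um = cog_position(q_left=30.0, q_right=70.0, pitch=pitch_um)
print(f"hit position: {x_um:.1f} um from the right-hand strip")

for s_over_n in (2.0, math.sqrt(12), 10.0, 20.0):
    print(f"S/N = {s_over_n:5.2f}: analog {cog_resolution(pitch_um, s_over_n):5.1f} um, "
          f"binary {binary_resolution(pitch_um):5.1f} um")
```

The loop shows the crossover at $S/N = \sqrt{12}$: below it the binary readout is the better choice, above it the analog information wins.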


6.1.1.3 From One-Dimensional Measurement to Space Point 

In many cases, the tracking detector consists of devices that can sense
only one-dimensional hit information, although devices with pixelated sensors have
recently become popular. For example, suppose that there are parallel wires
forming a plane inside the gas volume. By detecting a signal from a wire,
one can identify the position of the particle hitting the plane in the direction
perpendicular to the wires, but not in the direction along the wires. Adding another
layer parallel to the original plane does not change the situation. Instead, placing
another layer whose wires are tilted by some angle with respect to those of the first layer
allows us to obtain two-dimensional hit information. This tilt angle is referred to as the
stereo angle (Fig. 6.1: left).

Fig. 6.1 Left: One-dimensional measurements by two layers with a stereo angle give a
two-dimensional measurement. Right: Two-dimensional measurement by a pixelated device

A set of two layers with a stereo angle provides a two-dimensional measurement
of a hit position on a certain plane. Since a three-dimensional space hit
point is needed to reconstruct the particle trajectory, one must define the plane of the
two-dimensional measurement. When a pixel-type sensor is used, the plane is
defined to be the sensor. In the right of Fig. 6.1, for example, the x and y positions are
determined by the location of the pixel with the hit, while z is given by the plane of the sensor.
Of course the sensor has a finite thickness, and hence the "plane" must be defined in advance
by convention, such as the front surface or the middle between the front and back surfaces.
In a configuration with two stereo sensor layers, such as two planes of wires
(left of Fig. 6.1), the middle of the two layers is often defined to be the space hit point
plane.

The simplest and most natural way to get two-dimensional hit information from two
one-dimensional measurements is to use a stereo angle of 90°. In many
applications, however, it is difficult to have a large stereo angle because of the
geometrical constraints of the detector, especially in collider detectors where inactive
material such as cables must be minimised for 4π acceptance coverage. (In contrast,
fixed target experiments may not face such space constraints and can easily
achieve a 90° stereo angle.) To minimise the inactive material, it is preferable to have
the readout electronics, including cables, at only one end of the detector. Therefore,
many tracking detectors actually used in collider experiments have a very
small stereo angle so that the signals from the two planes are read out in the same
direction, at the cost of a worse position resolution along the wire direction.
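The trade-off between a small stereo angle and the resolution along the wires can be illustrated with a short sketch. It assumes an idealised geometry that is not taken from the text: the first layer has wires along y and measures u1 = x, the second layer is rotated by the stereo angle alpha and measures u2 = x cos(alpha) + y sin(alpha), and both layers have the same uncorrelated resolution. The function names are hypothetical.

```python
import math

def space_point(u1, u2, alpha):
    """Combine two one-dimensional measurements into (x, y).

    u1: coordinate measured by the layer with wires along y (u1 = x).
    u2: coordinate measured by the layer rotated by the stereo angle alpha.
    """
    sin_a = math.sin(alpha)
    if abs(sin_a) < 1e-12:
        raise ValueError("parallel layers cannot give a two-dimensional point")
    x = u1
    y = (u2 - u1 * math.cos(alpha)) / sin_a
    return x, y

def resolution_along_wire(sigma_u, alpha):
    """Error propagation for y, assuming equal, uncorrelated errors on u1 and u2."""
    return sigma_u * math.sqrt(1.0 + math.cos(alpha) ** 2) / abs(math.sin(alpha))

# Illustrative small stereo angle of 40 mrad and a true point (1.0, 2.0).
alpha = 0.040
u1 = 1.0
u2 = 1.0 * math.cos(alpha) + 2.0 * math.sin(alpha)
x_rec, y_rec = space_point(u1, u2, alpha)
```

In this model a 90° stereo angle gives the same resolution in both directions, whereas an angle of a few tens of mrad degrades the resolution along the wire by a factor of roughly 1/sin(alpha).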
6.1.2 Track Finding


Once a set of space hit points is identified, the next step is to form candidate
tracks from the hit points. The idea itself is simple: select the space hit
points based on a track hypothesis, which requires a model of the charged
particle trajectory. If only one particle travelled inside the tracking volume without
any noise hits, there would be nothing further to discuss: all you would need is to connect
the hit points. Usually, however, there are many particles creating many hit points,
resulting in a very complicated hit pattern, plus noise hits. Given a particle trajectory
model, the human eye is very good at pattern recognition, i.e. you can pick out
a set of hit points to form a track. In the old days of emulsions and
bubble chambers, it was actually the eye that worked as the track finder. But in most
high energy physics experiments these days, one must use computers to find
tracks among a huge number of hit points, because of the very high rate of data acquisition.
This means that algorithms for track finding have to be provided.

The first thing to consider is the modelling of charged particle trajectories. In
collider experiments, since in most cases a magnetic field exists in the tracking volume to
measure momentum, the trajectory is basically a helix. In many
fixed target experiments, on the other hand, it is a straight line, possibly with a curvature
in the single plane perpendicular to the magnetic field that is used for measuring momentum.
There may be other models depending on the experiment.
In any case, the point here is that the pattern recognition runs with a hypothesis on how
a charged particle travels in the tracking volume.

The core of track finding is to select a set of hit points based on a given track
model. Track finding algorithms fall mainly into two groups,
local and global methods, which we describe below.


6.1.2.1 Local Method 

The concept "local" means that the algorithm tries to find a single track first, and then
searches for the next one once the first one is found. This procedure is repeated until
no more candidates are found. In this way, multiple tracks are found
sequentially.

The local method starts by finding or clustering an initial seed from the hit list.
Consider tracking in a collider experiment such as the ATLAS experiment. The
tracking algorithm picks a hit on either the innermost or the outermost layer. Here
let's suppose that the algorithm looks for a track from the inside to the outside, i.e. a
hit on the innermost layer is picked at random first. Then a hit on the next outer
layer is searched for, based on the hypothesis that the track comes from the proton-proton
interaction point along a helical trajectory, for which a rough estimate of the interaction
point needs to be provided. The segment formed by the hits on the innermost and the
adjacent outer layer is tested against the track hypothesis with some
criteria. A segment that survives becomes the track seed. Once the seed is found, the
track candidate is extrapolated to the next outer layer and examined again to see whether it
has a matching hit. This procedure is repeated until the track candidate reaches the outermost
layer of the tracking volume. The examination is often based on a $\chi^2$ test or a
Kalman filter.¹

Fig. 6.2 An example of the track and space hit point distribution in the plane
perpendicular to the beam axis in a collider experiment without a magnetic field. The
coordinate origin is the collision point

Whether to start from the inner or the outer layers depends on the algorithm.
Because the inner layers have a higher hit density than the outer ones, starting from the
outside avoids unnecessary trials and hence saves CPU time. It can also reconstruct
a track from a secondary vertex located inside the tracking volume.
This type of secondary vertex may arise from long-lived particles such as the $K_S$.
On the other hand, an algorithm starting from the inside has an advantage in
efficiency because it tries every possibility, even though it costs CPU time. In ATLAS,
both algorithms are used in parallel. The resulting track candidates from
both algorithms are fed into the next stage of the tracking, the track fitting.

After a track candidate is found, the space hit points used to reconstruct
it are, in many algorithms, removed from the hit list for the next track candidates.
The algorithm then searches for the next track candidate using the
updated hit list and continues the same procedure until no more candidates
satisfying the selection criteria are found. In dense experimental conditions such as
those at hadron colliders, however, it may also be an option to leave some hits on
the list for redundancy, even though they have already been used. The choice is up to
you.
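The seed-and-follow logic of the local method can be sketched in a deliberately simplified setting. The example below assumes no magnetic field, so that a track from the origin has a constant azimuthal angle on every layer, and it replaces the $\chi^2$ or Kalman filter test by a fixed angular window. The function names, the window size and the minimum number of hits are hypothetical choices for illustration only.

```python
import math

def dphi(a, b):
    """Signed angular difference a - b wrapped into [-pi, pi)."""
    return (a - b + math.pi) % (2.0 * math.pi) - math.pi

def find_tracks_local(layers, window=0.02, min_hits=3):
    """Find tracks one at a time, from the innermost layer outwards.

    layers: list (inner to outer) of lists of hit azimuthal angles in rad.
    Returns a list of tracks, each a list of (layer index, phi) pairs.
    """
    remaining = [list(layer) for layer in layers]
    tracks = []
    for seed_phi in list(remaining[0]):
        candidate = [(0, seed_phi)]
        offsets = [0.0]  # angular offsets of accepted hits relative to the seed
        for i in range(1, len(remaining)):
            # Extrapolate: straight line from the origin, so phi stays constant.
            phi_est = seed_phi + sum(offsets) / len(offsets)
            best = min(remaining[i], key=lambda p: abs(dphi(p, phi_est)), default=None)
            if best is not None and abs(dphi(best, phi_est)) < window:
                candidate.append((i, best))
                offsets.append(dphi(best, seed_phi))
        if len(candidate) >= min_hits:
            tracks.append(candidate)
            # Remove the used hits so that later candidates cannot reuse them.
            for i, phi in candidate:
                remaining[i].remove(phi)
    return tracks

# Three straight tracks plus two noise hits (2.5 and -0.7).
layers = [
    [0.100, 1.000, -2.000],
    [0.102, 0.998, -2.001, 2.5],
    [0.099, 1.001, -1.999],
    [0.101, 1.002, -2.002, -0.7],
]
tracks = find_tracks_local(layers)
```

Removing the used hits corresponds to the first option described above; keeping them on the list for redundancy would amount to skipping the final removal loop.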


6.1.2.2 Global Method 

The concept "global" means that the algorithm tries to find all tracks at once from
the list of space hit points, in contrast to the local method. These
algorithms are completely different from the ones discussed so far, and their
variety is wide. Here we pick out and introduce just two widely
used algorithms.


1 The Kalman filter is an algorithm to estimate an unknown state based on a series of observed
measurements and their errors. See, for example, Ref. [1] for details.
Histogramming method: For illustration, suppose we want to find
tracks in a collider detector without any magnetic
field. Suppose also that the particle-particle collision point
(collision point for short) is known. Figure 6.2 shows
an example of the space hit points projected onto the
plane perpendicular to the beam direction. If you plot
the azimuthal angle, $\phi$, of the hit points in Fig. 6.2, the $\phi$
distribution will have seven peaks, each corresponding
to one track. By making such a histogram, one can
identify the groups of hits in an event all at once.
This is the basic idea of the histogramming method. In
an actual application with a magnetic field, a coordinate
transformation is carried out first. Suppose $x$ and $y$
are the values in the original coordinates; then

$$u = \frac{x}{x^2 + y^2}, \qquad v = \frac{y}{x^2 + y^2}$$

maps circular trajectories passing through the origin onto straight lines
in the $(u, v)$ plane, so that the simple histogramming method
for straight lines can be applied.
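The histogramming idea and the coordinate transformation can be sketched as follows. This is a minimal illustration under assumptions that are not in the text: hits are given as (x, y) pairs with the collision point at the origin, the number of bins and the peak threshold are arbitrary, and the function names are hypothetical.

```python
import math

def phi_histogram(hits, n_bins=180):
    """Fill a histogram of the azimuthal angle of each hit."""
    counts = [0] * n_bins
    for x, y in hits:
        phi = math.atan2(y, x)  # in [-pi, pi]
        b = int((phi + math.pi) / (2.0 * math.pi) * n_bins) % n_bins
        counts[b] += 1
    return counts

def peak_bins(counts, min_hits):
    """Bins with at least min_hits entries are taken as track candidates."""
    return [b for b, c in enumerate(counts) if c >= min_hits]

def conformal_map(x, y):
    """u = x / (x^2 + y^2), v = y / (x^2 + y^2)."""
    r2 = x * x + y * y
    return x / r2, y / r2

# Two straight tracks from the origin (phi = 0.5 and 2.0) and one noise hit.
hits = [(r * math.cos(0.5), r * math.sin(0.5)) for r in range(1, 6)]
hits += [(r * math.cos(2.0), r * math.sin(2.0)) for r in range(1, 6)]
hits.append((-1.0, -1.0))
counts = phi_histogram(hits)
peaks = peak_bins(counts, min_hits=4)

# Points on a circle through the origin with centre (3, 4) map onto the
# straight line 2*3*u + 2*4*v = 1 in the (u, v) plane.
mapped = [conformal_map(x, y) for x, y in [(6.0, 0.0), (0.0, 8.0), (6.0, 8.0)]]
```

In a real detector the bin width would have to be matched to the hit resolution, and neighbouring bins would normally be merged before counting peaks.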

Hough transformation: Another well-known example of the global method is the
Hough transformation, which is widely used in digital
image processing. Suppose we have a line on the $x$-$y$
plane. Given that $r$ is the distance between the line and the
coordinate origin, and $\theta$ the angle between the $x$-axis
and the normal to the line in question, the line satisfies
the following equation:

$$r = x\cos\theta + y\sin\theta .$$

This means that any point $(x_0, y_0)$ on the line satisfies

$$r(\theta) = x_0\cos\theta + y_0\sin\theta .$$

Therefore, the set of lines passing through the point $(x_0, y_0)$ is
represented by a sine curve on the $(r, \theta)$ plane, which
we call the Hough space. The transformation of the set of
lines through a single point on the original $x$-$y$ plane into a
sine curve in the Hough space is called the Hough
transformation. Let's now assume that we have five points
on a straight line on the $x$-$y$ plane. The set of
straight lines passing through each point is transformed to the
corresponding sine curve in the Hough space. Since
there are five points, there are also five sine curves
after the Hough transformation. Because the five points
on the $x$-$y$ plane lie on a common straight line, which
we would like to find, there should be a crossing point
of the five sine curves on the $(r, \theta)$ plane. The crossing
point $(r_0, \theta_0)$ gives us the straight line on the $x$-$y$ plane,
i.e. $r_0 = x\cos\theta_0 + y\sin\theta_0$ is the expression of the line
on which the five points lie. This is how the Hough
transformation allows us to determine a line from a set of
space hit points.
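In practice the crossing point is usually located by voting in a binned Hough space. The sketch below is one simple way to do this and is not tied to any particular experiment: the binning, the range of r and the function name are assumptions made for the example.

```python
import math

def hough_line(points, n_theta=180, r_max=10.0, n_r=200):
    """Vote in a binned (r, theta) space and return the most populated bin.

    Each point (x, y) votes along its curve r = x cos(theta) + y sin(theta).
    Returns (r0, theta0, votes): bin-centre r, bin-edge theta, number of votes.
    """
    accumulator = {}
    for x, y in points:
        for it in range(n_theta):
            theta = math.pi * it / n_theta
            r = x * math.cos(theta) + y * math.sin(theta)
            ir = int((r + r_max) / (2.0 * r_max) * n_r)
            if 0 <= ir < n_r:
                accumulator[(ir, it)] = accumulator.get((ir, it), 0) + 1
    (ir, it), votes = max(accumulator.items(), key=lambda kv: kv[1])
    r0 = -r_max + (ir + 0.5) * 2.0 * r_max / n_r
    theta0 = math.pi * it / n_theta
    return r0, theta0, votes

# Five points on the line y = x + 2, i.e. r = sqrt(2) and theta = 135 degrees.
points = [(-2.0, 0.0), (-1.0, 1.0), (0.0, 2.0), (1.0, 3.0), (2.0, 4.0)]
r0, theta0, votes = hough_line(points)
```

The precision of the result is limited by the bin size, so a finer binning or a subsequent fit to the selected hits is needed for an accurate line estimate.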

The transformation shown above is for a straight line.
It is possible to select a different transformation suitable
for your application, for example for curves such
as charged particle trajectories in a magnetic field. In
this case, assuming the trajectory to be a helix, i.e. a circle
in the plane perpendicular to the magnetic field, the
parameters representing the circle are the coordinates of
the centre, $x$ and $y$, and the radius. Hence there are three
parameters instead of the two, $r$ and $\theta$, of the straight line.
Then, by performing a similar Hough transformation
into the three-dimensional parameter space, one obtains a
curved surface for each single point on the $x$-$y$ plane. By
repeating the Hough transformation for all the measured
points, we get a set of curved surfaces. The
crossing of the curved surfaces represents the circle
that we want to determine on the $x$-$y$
plane, i.e. a track.
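As a sketch of the circular case, write the centre of the circle as $(a, b)$ and its radius as $R$ (symbols chosen here for illustration, not taken from the text). A measured point $(x_0, y_0)$ then constrains the circle parameters as follows:

```latex
(x_0 - a)^2 + (y_0 - b)^2 = R^2
\quad\Longrightarrow\quad
R(a, b) = \sqrt{(x_0 - a)^2 + (y_0 - b)^2}
```

For a fixed point this is a cone-shaped surface in the $(a, b, R)$ space, and the surfaces from hits on the same circle meet at the parameters of that circle. If the track is additionally assumed to pass through the origin, then $R^2 = a^2 + b^2$ and the condition reduces to $x_0^2 + y_0^2 = 2 a x_0 + 2 b y_0$, which is linear in $(a, b)$ and brings the problem back to two parameters.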


6.1.3 Track Fitting 


The final step of the tracking is the fitting of the space hit points associated
with each track candidate to the track model. The concept of the track fitting is
rather simple: it is the least squares method, which minimises the difference between
the measurements and the track hypothesis. More specifically, $\chi^2$ is defined to be

$$\chi^2 = \sum_{i=1}^{N} \frac{\left(y_i - f(x_i; \theta)\right)^2}{\sigma_i^2},$$

where $y_i$ is the measurement at $x_i$, $f(x_i; \theta)$ the prediction of the track based
on some trajectory model, $\sigma_i$ the measurement error, and $\theta$ the set of parameters which
you want to obtain by minimising the $\chi^2$. In collider experiments, the
trajectory of a charged particle is modelled by a helix described by five parameters.
Assuming each measurement is described by a Gaussian p.d.f., which is the case in
many measurements, the $\chi^2$ can be written as

$$\chi^2 = \sum_{i,j=1}^{N} \left(y_i - f(x_i; \theta)\right) \left(V^{-1}\right)_{ij} \left(y_j - f(x_j; \theta)\right),$$

where $V$ is the covariance matrix. This can be further written in general matrix
notation as

$$\chi^2 = (\mathbf{y} - \mathbf{f})^T V^{-1} (\mathbf{y} - \mathbf{f}),$$

where $\mathbf{y}$ is the vector of the measurements, and $\mathbf{f}$ is the vector of predicted values. The track
fitting searches for the parameters $\theta$ that minimise the $\chi^2$. There are various
approaches to this minimisation. The details of the mathematical treatment can be
found elsewhere (see Ref. [2] for example). Instead we discuss a few points that need
to be considered particularly in high energy physics.
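The least squares principle can be illustrated with the simplest possible track model. The sketch below fits a straight line y = a + b x with uncorrelated errors using the closed-form solution of the weighted least squares problem; it stands in for the five-parameter helix fit only as an illustration, and the function name and input values are hypothetical.

```python
def fit_line(xs, ys, sigmas):
    """Weighted least squares fit of y = a + b*x with uncorrelated errors.

    Returns (a, b, chi2), where chi2 is the minimised sum of squared,
    error-normalised residuals.
    """
    w = [1.0 / s ** 2 for s in sigmas]
    s0 = sum(w)
    sx = sum(wi * x for wi, x in zip(w, xs))
    sy = sum(wi * y for wi, y in zip(w, ys))
    sxx = sum(wi * x * x for wi, x in zip(w, xs))
    sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    delta = s0 * sxx - sx * sx
    a = (sxx * sy - sx * sxy) / delta
    b = (s0 * sxy - sx * sy) / delta
    chi2 = sum(((y - (a + b * x)) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))
    return a, b, chi2

# Hits lying exactly on y = 1 + 2x: the fit should return a = 1, b = 2, chi2 = 0.
a, b, chi2 = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0], [0.1] * 4)

# Moving one hit off the line gives a non-zero chi2.
_, _, chi2_shifted = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.3, 9.0], [0.1] * 4)
```

For a non-linear model such as a helix, or for correlated errors, the same $\chi^2$ is minimised iteratively or with a Kalman filter rather than in closed form.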


• The Kalman filter has been widely used recently because it can naturally handle
effects due to the interaction of a particle with the material in the tracking detector,
such as Coulomb multiple scattering and/or energy loss in the tracking device.

• Speed is crucial. This is especially true for experiments in a high rate and
high multiplicity environment such as hadron colliders. Not only the efficiency
and/or resolution but also the speed needs to be optimised in the algorithm.


Once the track fit has converged, one finally obtains the reconstructed
tracks. As all the track parameters are determined at this point, one can
deduce the momentum of the reconstructed tracks, i.e. of the particles, at an arbitrary position.
In collider experiments, the momentum at the collision point is the
quantity of interest in most cases. In addition, the complete track parameters
allow us to extrapolate the particle trajectories outside the tracking volume,
which is sometimes important in particle identification, such as electron or
muon identification, as shown later in this chapter.


6.1.4 Vertex Finding 


It is important to know the location of the collision point because it is where the particle
reaction of interest happens, and hence where the momentum vectors of the generated
particles are defined. Therefore, the event-by-event measurement of the collision point
is a crucial ingredient of the physics analysis. This is of particular importance for analyses
handling neutral particles, because the momentum vector of a neutral
particle cannot be defined without knowing the location of the collision point.

In collider experiments, the collision point is often called the primary vertex.
There is also another type of vertex. For example, b-hadrons generated in the collisions
can travel a few mm at the LHC before decaying, creating a vertex at the
decay point of the b-hadron because the decay products emerge and form a kink.
This type of vertex, caused by the decay of a particle, is called a secondary
vertex.

The concept of vertex finding is simple: after the tracking, one extrapolates
two or more reconstructed tracks in the direction where the vertex is expected to
exist. The intersection of the extrapolated tracks can be a vertex.

The actual vertex finding starts by picking up all reconstructed tracks that pass some
selection criteria, which assure the quality of the tracks. The tracks are then examined
to decide which ones should be associated with the same vertex candidate, using a least squares
fit or, equivalently, Kalman filtering. A track that significantly worsens the $\chi^2$
value is removed from the vertex association or down-weighted in the fit.
The latter technique is called adaptive vertex fitting. The testing of
the track association continues until the $\chi^2$ value stabilises. The fit after this
removal process gives us the best possible vertex estimate.
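The iterative removal of badly matching tracks can be sketched in one dimension. The example below assumes that each track is summarised only by its position z0 along the beam axis and the corresponding error, fits the vertex as a weighted mean, and drops the worst track while its $\chi^2$ contribution exceeds a cut. The function name, the cut value and the inputs are hypothetical, and a real vertex fit works in three dimensions with full track covariance matrices.

```python
def fit_vertex_z(tracks, chi2_cut=9.0):
    """Fit a vertex position along the beam axis with outlier removal.

    tracks: list of (z0, sigma) pairs. Returns (z_vertex, used_tracks).
    """
    used = list(tracks)
    while True:
        weights = [1.0 / s ** 2 for _, s in used]
        z_vertex = sum(w * z for w, (z, _) in zip(weights, used)) / sum(weights)
        # chi2 contribution of each track to the current vertex estimate
        contributions = [((z - z_vertex) / s) ** 2 for z, s in used]
        worst = max(range(len(used)), key=lambda i: contributions[i])
        if contributions[worst] <= chi2_cut or len(used) <= 2:
            return z_vertex, used
        del used[worst]  # remove the worst track and refit

# Four tracks from a common vertex near z = 0.1 and one unrelated track at z = 3.
tracks = [(0.10, 0.05), (0.12, 0.05), (0.08, 0.05), (0.11, 0.05), (3.0, 0.05)]
z_vertex, used = fit_vertex_z(tracks)
```

An adaptive fit would instead keep all tracks and multiply each weight by a factor that decreases smoothly with the $\chi^2$ contribution.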

In order to improve the track reconstruction and the vertex finding, the track fitting
is repeated after finding the vertex, followed again by the vertex fitting. This iterative
process is of particular importance in dense environments such as hadron
colliders, where not only a huge number of tracks but also many interaction (collision)
points, and hence many vertices, exist per bunch crossing.

At hadron collider experiments, many interactions occur per bunch
crossing, as just mentioned. At the LHC, for example, more than 20 interactions occur
on average. Figure 6.3 shows an example where 65 vertices are found, while the bunch length
is 7.5 cm. Out of these 65 interaction points, we have to find, as the next step, the vertex where
the event of interest was generated. In the example of Fig. 6.3, two
muons emerging from one of the 65 vertices are identified as being consistent with a
Z decay. Since such interesting events often have a high momentum transfer between the
colliding partons, called hard scattering (see Sect. 2.5), the selection of the vertex
is based on some measure that represents the momentum transfer of the collision.
For this reason, the sum of the $p_T$ of the tracks, or the number of tracks associated with
each vertex, is frequently used. Note that people working at hadron colliders sometimes call
only this hard-scatter interaction point the primary vertex, although in the original
definition primary vertices are those generated by the collisions and secondary vertices
those generated by particle decays. They call the other vertices induced by the collisions
primary vertex "candidates" or pile-up vertices. Readers should be aware of this
difference.

Fig. 6.3 Event display obtained from real data in the ATLAS experiment. Reprinted under the
Terms of Use from [3] ATLAS Experiment © 2022 CERN. All rights reserved. The reconstructed
tracks are shown as lines, and the vertices as circles. There are 65 proton-proton interaction
points on top of the one where a Z boson is produced and decays into a dimuon. This dimuon is shown
as yellow lines
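The vertex selection described above can be sketched in a few lines. The data layout (a dictionary with a position `z` and a list `track_pt` of track transverse momenta in GeV) and the numerical values are hypothetical and serve only to illustrate the ranking by the sum of $p_T$ or by the track multiplicity.

```python
def select_hard_scatter(vertices, use_multiplicity=False):
    """Pick the vertex with the largest sum of track pT (or number of tracks)."""
    if use_multiplicity:
        return max(vertices, key=lambda v: len(v["track_pt"]))
    return max(vertices, key=lambda v: sum(v["track_pt"]))

# Hypothetical event: two soft pile-up vertices and one vertex with two hard muons.
vertices = [
    {"z": -3.1, "track_pt": [0.6, 0.8, 1.1, 0.5, 0.7]},
    {"z": 0.4, "track_pt": [45.2, 41.7, 1.3]},
    {"z": 5.8, "track_pt": [0.9, 1.4]},
]
hard_scatter = select_hard_scatter(vertices)
busiest = select_hard_scatter(vertices, use_multiplicity=True)
```

In this example the two criteria select different vertices, which shows why the choice of the measure has to be matched to the physics process under study.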

Once we know the location of the primary vertex, or the hard-scatter vertex, the
momentum vector of each charged particle reconstructed by tracking is recalculated
with respect to the primary vertex. This is one of the pieces of information
ultimately necessary in physics analyses using charged particles.

Special care needs to be taken in finding secondary vertices, as the location of a
secondary vertex is not known a priori, in contrast to primary vertex finding where
the collision point is known to some degree. The basic procedure is the same as for the
primary vertex search. The only explicit difference is that the tracks not associated
with the primary vertex are the candidates to form the secondary vertex. The idea
is simple, but the performance of the secondary vertex
finding, and also of the primary vertex reconstruction, is sensitive to this selection.
For example, if the selection
criteria are tight, i.e. if you select only the tracks whose impact parameter with respect
to the primary vertex is large enough, the reconstruction efficiency of the secondary
vertex will be poor, while the reconstructed primary vertex position is free from
the possible bias due to tracks emerging from the secondary vertex. Therefore,
optimisation based on the physics requirements is of particular importance. In the
end, the secondary vertex finding allows us to identify the decay position
of $K_S$, $\Lambda$, b-hadrons, and so on, which is an important ingredient of physics analysis.


6.2 Electron and Photon 

6.2.1 Interactions with Materials 

Electrons and photons lose their energy by developing a characteristic shower in
the electromagnetic (EM) calorimeter, which is called an EM shower. It arises from the
cascade of bremsstrahlung ($e$ + material → $e\gamma$) and $e^+e^-$ pair
production ($\gamma$ + material → $e^+e^-$), as shown in Fig. 6.4. These processes are based on
the interaction of electrons and photons with the absorber material, for example
lead (Pb) in the ATLAS detector and crystal (PbWO$_4$) in the CMS detector.²

Fig. 6.4 Schematic view of an EM shower: a photon is injected into a calorimeter (absorber)

A charged particle such as an electron interacts with the electrons in the atoms (or
molecules) of the detector material through the EM interaction, and
ionises or excites the atoms. This is why a charged particle loses energy in materials.
This process is called ionisation loss and can be described by the Bethe-Bloch
formula. In addition, a charged particle radiates photons when it is decelerated in
materials, which is called bremsstrahlung. The ionisation loss is dominant in the low
energy region (low $\beta\gamma$), while bremsstrahlung dominates in the high energy region (high $\beta\gamma$).
The energy at which the ionisation loss and the bremsstrahlung energy loss are equal is
called the critical energy. It is about 7.3 MeV for electrons in Pb.

There are three kinds of interactions of a photon with materials: the photoelectric
effect, Compton scattering and $e^+e^-$ pair production. Pair production becomes possible
above the threshold of $2m_e$ = 1.022 MeV and is dominant at high energies.
Once an $e^+e^-$ pair is created, the
description of the interaction of an electron with materials applies. This is why the detector
response to a photon is similar to that to an electron to first order, as mentioned
in Sect. 5.4.2.

The development of the EM shower stops at the critical energy. One of the key
parameters to describe calorimeter performance is the radiation length $X_0$, which
is the length over which the energy of an electron is reduced to a fraction $1/e$ of
its initial value when passing through the material. The unit of $X_0$ is g/cm² or cm: $X_0$ for Pb
is 0.56 cm, and $X_0$ for PbWO$_4$ is 0.89 cm. Typical EM calorimeters have at least $20X_0$
(ideally $25X_0$) to stop EM showers. Materials with a smaller $X_0$ can stop EM showers
in a smaller space.
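The numbers above translate directly into calorimeter depths. The sketch below uses the radiation lengths and the critical energy quoted in the text; the exponential energy attenuation follows from the definition of the radiation length, while the estimate of the number of shower generations uses the simple Heitler toy model (the energy per particle halves at each generation until it reaches the critical energy), which is an added assumption and not part of the text.

```python
import math

# Radiation lengths in cm, as quoted in the text.
X0_CM = {"Pb": 0.56, "PbWO4": 0.89}

def depth_cm(material, n_x0):
    """Physical depth corresponding to n_x0 radiation lengths."""
    return n_x0 * X0_CM[material]

def mean_energy_after(e0, depth, material):
    """Mean electron energy after `depth` cm: reduced by 1/e per radiation length."""
    return e0 * math.exp(-depth / X0_CM[material])

def heitler_generations(e0_mev, critical_energy_mev):
    """Toy model: generations needed for E0 / 2^n to reach the critical energy."""
    return math.log(e0_mev / critical_energy_mev) / math.log(2.0)

pb_depth = depth_cm("Pb", 20)        # depth of 20 X0 of lead
pbwo4_depth = depth_cm("PbWO4", 20)  # depth of 20 X0 of PbWO4
n_gen = heitler_generations(100_000.0, 7.3)  # 100 GeV electron in Pb
```

Twenty radiation lengths correspond to about 11 cm of lead but about 18 cm of PbWO$_4$, which illustrates why a smaller $X_0$ allows a more compact calorimeter.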

The main material in which electrons and photons lose their energy is that of the
calorimeters. In most actual collider experiments, however, there is
material in front of the calorimeter where bremsstrahlung, $e^+e^-$ pair production
etc. are possible, for example a beam pipe and inner tracking detectors. An electron
can be scattered by the Coulomb force (called multiple scattering) or radiate
photons via bremsstrahlung. A photon can be converted into an $e^+e^-$ pair as
shown in Fig. 6.5, which produces oppositely charged particles with zero opening
angle and an imbalance in momenta. In this case, a positron and an electron are detected
in the calorimeter instead of a photon. Such a photon is called a converted photon. The
position where the conversion happens, called the conversion vertex, can be
obtained from the two charged tracks reconstructed in the inner tracking detector.
Photons that are detected in the calorimeter without conversion
are often called unconverted photons to distinguish them from converted photons.

2 The ATLAS EM calorimeter is a sampling calorimeter, where the absorber is Pb and the detector
is based on liquid argon, whereas the CMS EM calorimeter is a homogeneous calorimeter, where the
absorber and detector parts are both made of the crystal (PbWO$_4$).

Fig. 6.5 A photon interacts with the inner tracking detector and is converted into $e^+e^-$ (blue and
red curves) in the ATLAS detector. The conversion vertex is shown as a brown point. Reprinted
under the Creative Commons Attribution 4.0 International License from [4] © 2011 CERN for the
benefit of the ATLAS Collaboration


6.2.2 Reconstruction 


Electrons and photons are reconstructed by clustering cells of the EM calorimeter,
in which their energies are deposited. Typical algorithms are the sliding
window algorithm [5], the topological clustering algorithm [6] etc. The reconstruction
of electrons and photons is based on the sliding window algorithm with a sliding
window seed size of 3 × 5. The topological clustering algorithm was used for the
isolation energy calculation in the ATLAS experiment [7,8]. Figure 6.6 shows a
cluster (yellow) from the sliding window algorithm (5 × 7) and several clusters (red)
from the topological clustering algorithm. To perform the topological clustering, cells
are categorised into, for example, three classes: 4σ, 2σ and 0σ cells. The
4σ cells are those with an energy four or more times larger than their expected
noise (σ), and the 2σ (0σ) cells are those with an energy between 2σ and 4σ (below 2σ).
Then, in the clustering step, one of the 4σ cells is selected as a seed cell and its
neighbouring 4σ or 2σ cells in the three spatial directions are connected until there are no
more neighbouring 4σ or 2σ cells. Finally, all the surrounding 0σ cells are connected to form
clusters, which are the electron and photon candidates.

Fig. 6.6 Schematic view of clusters for an electron in the ATLAS experiment. Reprinted under the
Creative Commons Attribution 4.0 International License from [7] © CERN for the benefit of the
ATLAS collaboration 2019. The 5 × 7 cells shown in yellow are those obtained using the
sliding window algorithm. This cluster is used to obtain the electron energy. Other clusters (red) are
obtained from the topological clustering algorithm and are used to evaluate the isolation variable,
which is calculated using clusters inside ΔR = 0.4 (the blue region)
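The 4σ/2σ/0σ growing procedure can be sketched on a small grid. This is a simplified illustration rather than the ATLAS implementation: it works on a two-dimensional grid of cell significances (energy divided by expected noise) instead of three spatial directions, uses only the four nearest neighbours, and the function name and grid values are hypothetical.

```python
from collections import deque

def topo_cluster(significance, seed_thr=4.0, grow_thr=2.0):
    """Build clusters from seed cells, growing through cells above grow_thr.

    significance: 2D list of E/sigma per cell. Cells below grow_thr that touch
    a growing cell are attached to the cluster but do not grow it further.
    Returns a list of clusters, each a sorted list of (row, column) cells.
    """
    n_rows, n_cols = len(significance), len(significance[0])
    assigned = [[False] * n_cols for _ in range(n_rows)]
    seeds = sorted(
        ((significance[r][c], r, c)
         for r in range(n_rows) for c in range(n_cols)
         if significance[r][c] >= seed_thr),
        reverse=True,
    )
    clusters = []
    for _, r0, c0 in seeds:
        if assigned[r0][c0]:
            continue  # already absorbed by a cluster grown from a stronger seed
        assigned[r0][c0] = True
        cluster = [(r0, c0)]
        queue = deque([(r0, c0)])
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if not (0 <= rr < n_rows and 0 <= cc < n_cols) or assigned[rr][cc]:
                    continue
                assigned[rr][cc] = True
                cluster.append((rr, cc))
                if significance[rr][cc] >= grow_thr:
                    queue.append((rr, cc))  # only 2-sigma or 4-sigma cells keep growing
        clusters.append(sorted(cluster))
    return clusters

grid = [
    [0.1, 0.3, 0.2, 0.1, 0.0],
    [0.2, 2.5, 5.0, 0.4, 0.1],
    [0.1, 0.5, 2.2, 0.3, 0.2],
    [0.0, 0.1, 0.2, 0.1, 0.1],
]
clusters = topo_cluster(grid)
```

In this example the single seed with significance 5.0 grows through its two neighbours above 2σ and then collects the surrounding low-significance cells, leaving the far corners of the grid unclustered.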

Electrons can also be reconstructed as charged tracks using the inner
tracking detector information (Sect. 6.1), since they are charged particles. Clusters
matched to a charged track are classified as electrons, and those not matched to
any charged track as unconverted photons. The momentum of the matched
charged track is recalculated taking into account the possible energy loss due to
bremsstrahlung in the detectors in front of the calorimeter. A cluster matched to a
track pair from a reconstructed conversion vertex, or to a single track that has no hit in
the innermost layer of the inner tracking detector, is classified as a converted photon.
The energy calibration of electrons and photons is explained in Sect. 5.4.2.
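The classification rules of this paragraph can be written as a small decision function. The inputs are hypothetical flags that a reconstruction program would provide for each cluster; real algorithms apply additional matching criteria that are not described here.

```python
def classify_cluster(n_matched_tracks, from_conversion_vertex, has_innermost_hit):
    """Classify an EM cluster following the rules described in the text.

    n_matched_tracks: number of charged tracks matched to the cluster.
    from_conversion_vertex: True if the matched tracks form a track pair
        from a reconstructed conversion vertex.
    has_innermost_hit: True if the matched track has a hit in the innermost
        layer of the inner tracking detector.
    """
    if n_matched_tracks == 0:
        return "unconverted photon"
    if from_conversion_vertex:
        return "converted photon"
    if n_matched_tracks == 1 and not has_innermost_hit:
        return "converted photon"
    return "electron"

labels = [
    classify_cluster(0, False, False),
    classify_cluster(2, True, False),
    classify_cluster(1, False, False),
    classify_cluster(1, False, True),
]
```

The single-track case without an innermost-layer hit covers conversions where only one of the two tracks was reconstructed.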


6.2.3 Identification 


Reconstructed electrons and photons are candidates for true electrons and photons,
respectively. Other particles, which are backgrounds to electrons and photons, can
also be reconstructed with the same algorithms mentioned above. Such particles are
dominated by jets, which are explained in detail later (Sect. 6.4). The origin of a jet
is a gluon or a quark, which produces a set of particles after hadronisation. A jet
or a hadron can develop a hadronic shower in the calorimeter. The hadronic shower,
which is shown in Fig. 6.7, has two components: hadronic and EM components.
The hadronic component produces charged pions, charged kaons, protons, neutrons
etc. through the hadronic interaction (strong interaction).³ In addition, neutral pions
are also produced, but they are observed as photons since the lifetime of the neutral
pion ($8.5 \times 10^{-17}$ s) is very short and it immediately decays into two photons.
This is the EM component of a hadronic shower.

Fig. 6.7 Schematic view of a hadronic shower: a hadron ($n$, $p$, $\pi^\pm$, etc.) is injected into a
calorimeter

Electrons and photons can be separated from jets and hadrons using the differences
between an EM shower and a hadronic shower: the lateral (= transverse) and
longitudinal shower developments are different. For the lateral shower shape, an
EM shower is narrower than a hadronic shower since the constituents of a
jet ($\pi^\pm$, $K^\pm$, $p$, $n$, $\gamma$ etc.) are spread out. This is the case even for a single hadron, where
the hadronic shower can become wider by producing neutrons etc. For the longitudinal
shower shape, a hadronic shower extends into the hadronic calorimeter; in
other words, the shower does not stop in the EM calorimeter. For example, when there
are several longitudinal layers in the EM calorimeter, as in the ATLAS detector, the
energy deposited in the outer layers of the EM calorimeter is larger for jets and hadrons
than for electrons and photons. Variables for the identification can be defined using
the calorimeter cells. Such variables are called shower shape variables, for example,
shower widths, ratios of energy deposited in different layers of the calorimeter, etc.
Figure 6.8 shows four variables for electrons in the ATLAS experiment: $w_{\eta 2}$ and
$R_\phi$ represent the narrowness in the lateral direction, and $R_{\mathrm{had1}}$ and $f_3$ the
shower development in the longitudinal direction. Since more than 10 variables of
shower shapes and tracks (if necessary) are used, a multivariate analysis
technique such as a combined likelihood, neural network, or boosted decision tree
is adopted. Possible reasons for the misidentification of electrons and photons using the
shower and track variables are given below.

3 Muons and neutrinos are produced from the decay of charged pions via the weak interaction:
$\pi^\pm \to \mu^\pm \nu$. The energy of these particles is largely undetected in the calorimeter.

Fig. 6.8 Distributions of two shower shapes for the electron identification from the ATLAS MC
simulation studies. Reprinted under the Creative Commons Attribution 4.0 International License
from [7] © CERN for the benefit of the ATLAS collaboration 2019. $w_{\eta 2}$ and $R_\phi$ are a shower width
and the ratio of the energy in 3 × 3 cells to the energy in 3 × 7 cells in the second layer of the
EM calorimeter, respectively. $R_{\mathrm{had1}}$ is the ratio of the transverse energy ($E_T$) in the first layer of the
hadronic calorimeter to the $E_T$ of the EM cluster. $f_3$ is the ratio of the energy in the third layer to the
total energy in the EM calorimeter. Signals are electrons from Z and $J/\psi$ decays and backgrounds
are electron candidates from multijet production, $\gamma$+jets etc.

Fig. 6.9 Schematic view of a fake electron from a jet: a charged pion overlaps with photons from
a neutral pion decay inside a jet

Fig. 6.10 $E/p$ distributions for electrons (signal) and hadrons (background) from the ATLAS MC
simulation studies. Reprinted under the Creative Commons Attribution 4.0 International License
from [7] © CERN for the benefit of the ATLAS collaboration 2019. $E$ is the energy measured in the
calorimeter and $p$ is the momentum measured in the inner tracking detector


Jets can be misidentified as electrons (called fake electrons), for example, because
a charged pion overlaps with photon(s) from a neutral pion decay, an η decay and so on
inside a jet. This is illustrated in Fig. 6.9. One of the useful discriminating variables for
this type of fake electron is E/p as shown in Fig. 6.10, where E is the energy measured
in the calorimeter and p is the momentum measured in the inner tracking detector. In
the case of true electrons, E/p should be close to 1 because E and p originate from
the same object, but in the case of jets (fake electrons) there is no clear correlation between
E and p because different particles can contribute to E or p. This variable is included
in the electron identification.
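
As a minimal sketch of how the E/p variable could be formed and used, the snippet below computes E/p for a candidate and applies a simple window around 1. The function names and the window limits (0.7 and 1.5) are hypothetical values chosen only for illustration; in practice E/p enters a multivariate identification rather than a hard cut.

```python
def e_over_p(cluster_energy_gev, track_momentum_gev):
    """Ratio of calorimeter energy to inner-tracker momentum for one candidate."""
    if track_momentum_gev <= 0.0:
        raise ValueError("track momentum must be positive")
    return cluster_energy_gev / track_momentum_gev


def looks_like_electron(cluster_energy_gev, track_momentum_gev, low=0.7, high=1.5):
    """Toy selection: accept candidates whose E/p lies in a window around 1.

    The window limits are hypothetical and not taken from any experiment.
    """
    return low <= e_over_p(cluster_energy_gev, track_momentum_gev) <= high


if __name__ == "__main__":
    # A true electron: E and p come from the same particle.
    print(looks_like_electron(40.0, 39.0))
    # A jet: a soft charged pion track overlapping with energetic photons.
    print(looks_like_electron(40.0, 8.0))
```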

Not only jets but also other objects such as τ-jets and converted photons are
misidentified as electrons. τ-jets are misidentified when the τ decays hadronically to
one charged particle, so-called one-prong (e.g. τ → π±π⁰ν), for the same reason
as jets, that is, the overlap between the π± and the γ from the π⁰. A simple τ-veto algorithm is
applied: electron candidates that are identified as τ-jets with high confidence by the τ identification
are rejected. For converted photons, a cluster may have a matched
charged track when one of the charged particles is not reconstructed. To reduce such
misidentification (see Fig. 6.11), the track is required to be associated with a primary vertex
using impact parameters, since tracks from a conversion vertex have large impact
parameters. In addition, a hit in the innermost layer of the inner tracking detector is
required for electron candidates.

Jets are misidentified as photons (called fake photons), for example, when a neutral
pion from a jet carries most of the energy of the jet. This is illustrated in Fig. 6.12. In
principle, two photons should be observed inside a jet because a neutral pion decays
into two photons. To separate a single photon from a set of two photons, the finely
segmented first layer is used in the ATLAS experiment as shown in Fig. 6.13. A
single cluster is observed for a photon (left) but two clusters for a π⁰ (right).
Fig.6.11 Schematic views of conversions at the first layer (left) and at the beam pipe (right). When
two tracks are reconstructed, both cases are categorised into conversions. On the other hand, in case 
one of the tracks is misreconstructed, only the left is taken as a conversion to reduce fake converted 
photons. No hit in the first layer of the inner tracking detector is required for conversions 


Fig.6.12 Schematic view of a fake photon from a jet: most of jet’s energy is carried by a neutral 
pion 


Fig.6.13 Energy deposits in the three layers in the ATLAS EM calorimeter for a photon (left) and
two photons from a π⁰ (right). Reprinted under the Terms of Use from [9] ATLAS Experiment
© 2022 CERN. All rights reserved. Photons are injected from the bottom to the top. The energy
deposits are shown in yellow. In the right figure, two groups of energy deposits are observed in
the first layer (fine segmentation) of the calorimeter
Muons can be produced via decays of Higgs bosons, W/Z bosons, quarks and new
particles such as SUSY particles. Therefore, the reconstruction and identification of muons
with good quality over a wide range of momentum and solid angle are key to many
of the most important physics analyses in energy frontier experiments.

The muon is the second-generation charged lepton. Its characteristics are similar
to those of the electron except for the mass: an electric charge of −e, a spin of 1/2, and
a mass (m_μ) of 105.6583715 ± 0.0000035 MeV. Muons hardly produce either an
electromagnetic or a hadronic shower in our energy regime, but decay into an electron,
an electron anti-neutrino, and a muon neutrino via the weak interaction. Therefore,
the mean lifetime of the muon (τ_μ), even when flying in material, is very close to that
in vacuum, 2.1969811 ± 0.0000022 μs, which is relatively long. A muon with a
momentum (p_μ) of 1.0 GeV travels about 6 km on average within one mean
lifetime in the laboratory frame:

$$
c\tau_\mu\beta\gamma = c\tau_\mu\,\frac{p_\mu}{m_\mu}
= 3\times10^{8}\ \mathrm{(m/s)} \times 2.2\times10^{-6}\ \mathrm{(s)} \times \frac{1.0\ \mathrm{(GeV)}}{0.105\ \mathrm{(GeV)}}
\approx 6.3\times10^{3}\ \mathrm{(m)}
\tag{6.1}
$$

where c is the velocity of light, β = v/c is the ratio of the velocity of the muon to the
velocity of light, and γ = 1/√(1 − β²) is the Lorentz boost factor.
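
The arithmetic of Eq. (6.1) can be checked with a few lines of code. The sketch below evaluates cτβγ with βγ = p/m; the function name and the rounded constants are choices made here for illustration only.

```python
# Mean decay length of a muon in the laboratory frame: L = c * tau * beta * gamma,
# where beta * gamma = p / m (momentum and mass in the same units).
C_LIGHT_M_PER_S = 2.998e8    # speed of light
MUON_LIFETIME_S = 2.197e-6   # mean lifetime at rest
MUON_MASS_GEV = 0.10566      # muon mass


def mean_decay_length_m(momentum_gev, mass_gev=MUON_MASS_GEV, lifetime_s=MUON_LIFETIME_S):
    """Return c * tau * beta * gamma in metres."""
    beta_gamma = momentum_gev / mass_gev
    return C_LIGHT_M_PER_S * lifetime_s * beta_gamma


if __name__ == "__main__":
    for p in (1.0, 10.0, 100.0):
        length_km = mean_decay_length_m(p) / 1.0e3
        print(f"p = {p:6.1f} GeV -> mean decay length = {length_km:8.1f} km")
```

Since cτ alone is about 660 m, even a 1 GeV muon travels kilometres on average before decaying, far more than the size of a collider detector.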


Considering these unique characteristics of the muon, in collider experiments
muons can be detected by charged particle detectors located both inside and outside
the calorimeters. In this section, muon identification and reconstruction in
collider experiments are described using the ATLAS detector as an example.


6.3.1 Muon Momentum Measurement 


Muon reconstruction and identification in ATLAS rely on the inner tracking detector,
described in Sect. 3.3.2, and the muon spectrometer (MS). The track reconstruction is
first performed independently in the inner tracker and the MS. The information from both of
them is then combined to form the muon tracks that are used in physics analyses.


6.3.1.1 Effect of Multiple Scattering 

When a muon passes through the large volume of material in the detectors, the
effect of multiple scattering needs to be taken into account. The multiple scattering
angle is regarded as the accumulation of many Rutherford scatterings. The probability
of a single Rutherford scattering is inversely proportional to sin⁴(θ/2), where θ is
the scattering angle. The distribution has a sharp peak at θ = 0, meaning that
θ is typically very small. The mean square of the multiple scattering angle is statistically
regarded as the accumulation of the small-angle Rutherford scatterings,
⟨θ²⟩ = Σ_i θ_i². The multiple scattering angle is approximately Gaussian
distributed. The effect of large-angle scattering, which also follows the 1/sin⁴(θ/2)
distribution, shows up in the tail of the distribution. The root mean square of the multiple
scattering angle (θ_0 = √⟨θ²⟩) can be expressed as

$$
\theta_0 = \frac{13.6\ \mathrm{MeV}}{\beta c p}\sqrt{\frac{x}{X_0}}
\left[1 + 0.038\,\ln\left(\frac{x}{X_0}\right)\right]
\propto \frac{1}{p}\sqrt{\frac{x}{X_0}},
\tag{6.2}
$$

where p and βc are the momentum and velocity of a muon, respectively, and x/X_0 is
the thickness of the scattering medium in units of the radiation length (X_0). If the uncertainty of
the muon position measurements, σ_x in Eq. (5.2) or (5.3), is dominated by multiple
scattering in the detector material, the momentum resolution is independent of
p_T:

$$
\frac{\sigma_{p_T}}{p_T} \propto \sigma_x \cdot p_T \propto \theta_0 \cdot p_T
\propto \frac{1}{p_T} \cdot p_T \approx \mathrm{const.}
\tag{6.3}
$$
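
To get a feeling for the size of the effect, Eq. (6.2) can be evaluated numerically. The sketch below assumes a unit-charge particle and takes the momentum in MeV so that the constant 13.6 MeV can be used directly; the function names and the example thickness of 100 radiation lengths are illustrative assumptions, not values from the text.

```python
import math

MUON_MASS_MEV = 105.66


def beta_from_momentum(p_mev, mass_mev=MUON_MASS_MEV):
    """Velocity in units of c for a particle of momentum p and mass m."""
    return p_mev / math.sqrt(p_mev ** 2 + mass_mev ** 2)


def highland_theta0(p_mev, x_over_x0, mass_mev=MUON_MASS_MEV):
    """RMS multiple-scattering angle in radians, following Eq. (6.2)."""
    beta = beta_from_momentum(p_mev, mass_mev)
    return (13.6 / (beta * p_mev)) * math.sqrt(x_over_x0) * (1.0 + 0.038 * math.log(x_over_x0))


if __name__ == "__main__":
    thickness = 100.0  # hypothetical material thickness in radiation lengths
    for p_gev in (5.0, 50.0, 500.0):
        theta0 = highland_theta0(p_gev * 1000.0, thickness)
        # theta0 * p stays nearly constant, which is the content of Eq. (6.3).
        print(f"p = {p_gev:6.1f} GeV: theta0 = {theta0 * 1e3:7.3f} mrad, theta0 * p = {theta0 * p_gev:.4f}")
```

The product θ_0 · p printed in the last column is almost the same for all momenta, which is why the multiple scattering contribution to the relative momentum resolution is flat in p_T.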


6.3.1.2 Contributions to Muon Momentum Resolution 

The uncertainty of the position measurement σ_x usually comes from the accuracy of
the hit position measurement limited by the detector characteristics, misalignment of
detectors, multiple scattering in the detector, and fluctuations in the energy loss of the
muons traversing the material in front of the spectrometer. Figure 6.14 shows
the contributions to the momentum resolution for the ATLAS MS as a function of
transverse momentum [10]. The contribution of multiple scattering is independent
of the transverse momentum and dominates at moderate momentum (30 < p_T <
300 GeV), while the contributions of the hit position resolution (denoted as “Tube
resolution and autocalibration” in Fig. 6.14) and the detector (chamber in this case)
alignment are proportional to p_T, as follows from Eq. (6.3), and dominate at high momentum (p_T >
300 GeV). At low momentum (p_T < 30 GeV), energy loss fluctuations become
dominant.

The ATLAS MS is designed to detect muons in the pseudorapidity region up to
|η| = 2.7 and to provide momentum measurements with a relative resolution better
than 3% over a wide p_T range and up to 10% at p_T ~ 1 TeV. In order to satisfy these
requirements, the measurement precision of each hit on a muon track is required to
be typically better than 100 μm, which can be roughly estimated by Eq. (5.2). The
uncertainty of the alignment of the chamber positions is required to be at the level
of 30 μm.

Figure 6.15 shows the muon momentum resolution for the MS alone and for the combined
measurements by the MS and the inner tracker [10]. At low momentum (p_T < 30 GeV),
the measurement by the inner tracker is better due to the better spatial resolution of the silicon
strip and pixel detectors. On the other hand, at high momentum (p_T > 30 GeV), the
measurement by the MS becomes better than that by the inner tracker because the MS occupies
a larger space, which means L is larger in Eq. (5.2).
Fig. 6.14 Contributions to the muon momentum resolution for the ATLAS MS as a function of
transverse momentum. Reprinted under the Creative Commons Attribution 3.0 License from [10]
© 1997-2022 CERN


Fig.6.15 The muon momentum resolution for the muon spectrometer alone and the combined
measurements by the ATLAS MS and the ATLAS inner tracker as a function of the transverse
momentum. Reprinted under the Creative Commons Attribution 3.0 License from [10] ATLAS
Collaboration © 1997 CERN. The dashed curve is the resolution using only the inner tracker
6.3.2 Examples of Muon Detectors


Because the muon spectrometers have to cover the wide surface area of the barrel and
endcaps of the cylindrical detector system, they are required to be robust, mechanically
strong, and inexpensive as well as to provide good momentum resolution and
high efficiency. Because muons give us clear signatures of physics processes of interest
such as H → ZZ* → 4μ, the muon spectrometers are also used as trigger devices,
which provide fast information on the momenta, positions and multiplicity of muons
Fig. 6.16 The muon spectrometer for the ATLAS experiment. Reproduced by permission of IOP
Publishing from [11] © IOP Publishing Ltd and SISSA. All rights reserved
Fig.6.17 The cross-section of the ATLAS muon spectrometer: r-z view (Left) and r-φ view (Right).
Reprinted under the Creative Commons Attribution 3.0 License from [10] ATLAS Collaboration
© 1997 CERN


traversing the detector. This is called the first-level muon trigger, which
makes a trigger decision within a few microseconds using simple trigger logic implemented in
hardware. Gas detectors satisfy these requirements. For instance, the ATLAS
MS, shown in Figs. 6.16 and 6.17, consists of resistive plate chambers (RPC)
and thin gap chambers (TGC) to provide the fast muon trigger information, and
monitored drift tube (MDT) chambers and cathode strip chambers (CSC)
to reconstruct the muon trajectory precisely. The ATLAS MS is divided into a barrel part
(|η| < 1.05) and two endcaps (1.05 < |η| < 2.7).

Three large superconducting air-core toroid magnets provide magnetic fields with
a bending integral of about 2.5 T·m in the barrel and up to 6 T·m in the endcaps in
order to measure the muon momentum independently of the inner tracking system
with its solenoid magnet (Fig. 6.18). In the following sections, the RPC, TGC and MDT
used in the ATLAS muon spectrometer are introduced as examples of
muon chambers [11].


[Figure 6.18: field integral ∫B dl (T·m), with the barrel, transition and end-cap regions indicated]

Fig. 6.18 The magnetic fields provided by ATLAS toroid magnet. Reproduced by permission of 
IOP Publishing from [11] © IOP Publishing Ltd and SISSA. All rights reserved 


6.3.2.1 Resistive Plate Chamber 

In the barrel region (|η| < 1.05), trigger signals are provided by a system of resistive
plate chambers (RPCs). The RPC is a gaseous parallel electrode-plate detector
providing a typical space-time resolution of 1 cm × 1 ns with digital readout. The
mechanical structure of an RPC is shown in Fig. 6.19. Two resistive plates, made
of phenolic-melaminic plastic laminate, are kept parallel to each other at a distance
of 2 mm by insulating spacers. The gas gaps are filled with a gas mixture of
C2H2F4/Iso-C4H10/SF6 (94.7/5/0.3). The electric field between the plates of about
4.9 kV/mm allows avalanches to form along the ionising tracks towards the anode.
Since all primary electron clusters form avalanches simultaneously in the strong and
uniform electric field, a single signal is produced immediately after the passage
of the particle. The intrinsic time jitter is less than 1.5 ns. The signal is read out via
capacitive coupling to metallic strips, which are mounted on the outer faces of the
resistive plates. The total jitter of the RPC is less than 10 ns, which makes it possible to identify the
proton bunch crossing (spaced by 25 ns) and to produce fast trigger signals. The readout pitch
of the η- and φ-strips is 23-35 mm. The η- and φ-strips provide the bending view of the
trigger detector and the second-coordinate measurement, respectively. The second-coordinate
measurement, which cannot be done by the MDT chambers (see Sect. 6.3.2.3),
is also required for the offline pattern recognition.

The RPC system is made up of three stations, each with two detector layers. Two stations
installed at a distance of 50 cm from each other are located near the centre of the
magnetic field region and provide the low-p_T trigger (p_T > 6 GeV), while the third
station, at the outer radius of the magnet, allows the trigger to detect muon trajectories with
a larger radius of curvature and to increase the p_T threshold to 20 GeV, thus providing the high-p_T
trigger. The trigger logic requires three out of four layers in the middle stations
for the low-p_T trigger and, in addition, one of the two outer layers for the high-p_T
trigger (Fig. 6.20).
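
The layer-coincidence requirement described above can be written as a small piece of majority logic. This is only a sketch of the counting rule: the function names are invented here, and the real trigger additionally requires the hits to lie within p_T-dependent geometrical roads, which is not modelled.

```python
def low_pt_trigger(middle_layer_hits):
    """Low-pT condition: at least three of the four middle-station layers have a hit.

    middle_layer_hits is a sequence of four booleans, one per RPC layer
    of the two middle stations.
    """
    if len(middle_layer_hits) != 4:
        raise ValueError("expected four middle-station layers")
    return sum(bool(hit) for hit in middle_layer_hits) >= 3


def high_pt_trigger(middle_layer_hits, outer_layer_hits):
    """High-pT condition: low-pT condition plus a hit in one of the two outer layers."""
    if len(outer_layer_hits) != 2:
        raise ValueError("expected two outer-station layers")
    return low_pt_trigger(middle_layer_hits) and any(outer_layer_hits)


if __name__ == "__main__":
    print(low_pt_trigger([True, True, False, True]))                   # three of four layers
    print(high_pt_trigger([True, True, False, True], [False, True]))   # plus one outer layer
    print(high_pt_trigger([True, False, False, True], [True, True]))   # fails the low-pT part
```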
Fig.6.19 Mechanical structure of an RPC chamber. Reproduced by permission of IOP Publishing
from [11] © IOP Publishing Ltd and SISSA. All rights reserved. The unit of the numbers in the
figure is mm
Fig. 6.20 Cross-section of the upper part of the barrel muon spectrometer. Reproduced by permission
of IOP Publishing from [11] © IOP Publishing Ltd and SISSA. All rights reserved. Two
stations of the RPC are below and above the middle station of the MDT chambers. The outer station is above
the MDT in the large and below the MDT in the small sectors. Dimensions are in mm
Fig. 6.21 TGC structure showing anode wire, graphite cathodes, G-10 layers and a pick-up strip,
orthogonal to the wires (top) and cross-section of a TGC triplet and doublet module (bottom). 
Reproduced by permission of IOP Publishing from [11] © IOP Publishing Ltd and SISSA. All 
rights reserved 


6.3.2.2 Thin Gap Chamber 

In the endcap region (1.05 < |η| < 2.4), trigger signals are provided by a system
of thin gap chambers (TGCs). TGCs are multi-wire proportional chambers with the
characteristic that the wire-to-cathode distance of 1.4 mm is smaller than the wire-to-wire
distance of 1.8 mm, as shown in Fig. 6.21. The gas used is a mixture of CO2
and n-C5H12 (n-pentane) (55 : 45). The TGC is operated in a quasi-saturated mode with
a gas gain of about 3 × 10^5. The high electric field around the wires (at a potential of around 2800 V) and the
small wire-to-wire distance allow us to measure the muon trajectory with a good
time resolution and to identify the proton bunch crossing (spaced by 25 ns). The number of
wires in a wire group varies from 6 to 31 as a function of η, in order to match the
granularity to the required momentum resolution. The wire groups measure the η
direction of the muon trajectory. Two of the copper layers in the triplet and doublet modules,
which are marked as “Cu stripes” in Fig. 6.21, are segmented into readout strips to
read the azimuthal coordinate (φ) of the muon trajectory.

The inner wheel, formed by doublet modules, is placed before the endcap toroidal
magnet, while the big wheel consists of seven layers (a triplet module plus two
doublet modules) as shown in Fig. 6.22 and measures the muon trajectory in the
bending direction of the toroidal magnet.
Fig.6.22 Big wheel of TGC chamber. Reprinted under the Terms of Use from [12] ATLAS Experiment
© 2006 CERN. All rights reserved. The diameter of the Big wheel is about 25 m


6.3.2.3 Monitored Drift Tube 

Over most of the η range, a precise measurement of the track coordinates in the
principal bending direction of the toroidal magnetic field is provided by monitored
drift tube (MDT) chambers. The MDT system achieves a sagitta accuracy of 60 μm,
corresponding to a momentum resolution of about 10% at p_T = 1 TeV.

The basic element of the MDT is a pressurised drift tube with a diameter of
29.970 mm, operating with Ar/CO2 gas (93% : 7%) at 3 bar. The electrons resulting
from ionisation are collected at the central tungsten-rhenium wire with a diameter
of 50 μm at a potential of 3080 V, as shown in the left panel of Fig. 6.23. The average
drift velocity of electrons is about 20.7 μm/ns and the maximum drift time is about
700 ns. Making use of the radius-to-drift-time relation (r-t relation), the distance of
a muon track passing through the tube from the anode wire can be measured as a drift
circle. The shape of the r-t relation, which depends on parameters such as temperature,
pressure, the magnetic field, and field distortions caused by the positive ions after ionisation,
must be known with high accuracy in order to achieve better spatial resolution.
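
The numbers quoted above are mutually consistent, as the following sketch shows: with the average drift velocity, the maximum drift time corresponds to a drift radius close to the tube radius. The linear r-t relation used here is a deliberate simplification made for illustration; the real relation is non-linear and has to be calibrated.

```python
AVERAGE_DRIFT_VELOCITY_UM_PER_NS = 20.7   # average electron drift velocity quoted in the text
MAX_DRIFT_TIME_NS = 700.0                 # maximum drift time quoted in the text
TUBE_DIAMETER_MM = 29.970                 # outer tube diameter quoted in the text


def drift_radius_mm(drift_time_ns):
    """Drift-circle radius from the drift time, assuming a constant drift velocity.

    This linear r-t relation is only an approximation used for illustration.
    """
    if not 0.0 <= drift_time_ns <= MAX_DRIFT_TIME_NS:
        raise ValueError("drift time outside the physical range")
    return AVERAGE_DRIFT_VELOCITY_UM_PER_NS * drift_time_ns / 1000.0


if __name__ == "__main__":
    for t in (0.0, 100.0, 350.0, 700.0):
        print(f"t = {t:5.1f} ns -> r = {drift_radius_mm(t):6.2f} mm")
    print(f"tube radius = {TUBE_DIAMETER_MM / 2.0:.2f} mm")
```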

The mechanical structure of an MDT chamber is shown in the right panel of Fig. 6.23.
A chamber consists of two multi-layers of three or four drift tube layers. In order
to monitor the internal geometry of the chamber, it is equipped with four optical alignment rays, two
parallel and two diagonal. That is why the drift tube detector in the
ATLAS experiment is called “Monitored” Drift Tubes. The 1,150 MDT chambers are
constructed from 354,000 tubes and cover an area of 5,500 m². Each MDT chamber
provides a track segment. Muon tracks are reconstructed from
track segments obtained from the inner, middle and outer stations of MDT chambers.
Fig.6.23 Left: the cross-section of the MDT drift tube. Right: the mechanical structure of an MDT
chamber. Reproduced by permission of IOP Publishing from [11] © IOP Publishing Ltd and SISSA. 
All rights reserved 


6.3.3 Muon Reconstruction 


The muon reconstruction can be performed independently in the inner tracker and
MS. In the inner tracker, muons are reconstructed like any other charged particles,
as described in Sect. 6.1. This section focuses on the muon reconstruction
in the MS and on the combined muon reconstruction. More detail on the
muon reconstruction at the ATLAS experiment is given in Ref. [13].

Using the drift circles in MDTs or clusters in TGCs and RPCs, the muon recon- 
struction is subdivided into the three stages: segment-finding, segment-combining 
and track-fitting. 

Segment-finding starts with a search for hit patterns in a single station (i.e. inner, 
middle and outer stations of MDT, RPC and TGC chambers in case of the ATLAS 
MS) to form the track segments. The Hough transform is used to search for hits 
aligned on a trajectory in the detector. The track segments are reconstructed by a 
straight-line fit to the hits found in each layer. 
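
The idea behind the Hough transform can be illustrated with a toy example. In the sketch below each hit votes for all straight lines ρ = x cos θ + y sin θ passing through it, and the most populated (θ, ρ) cell identifies the hits aligned on a common line. The hit coordinates, bin sizes and function name are invented for illustration and do not correspond to the ATLAS implementation.

```python
import math
from collections import Counter

# Toy hits (x, y) in arbitrary units: six lie close to the line y = 2, two are noise.
DEMO_HITS = [(0.0, 2.0), (10.0, 2.1), (20.0, 1.9), (30.0, 2.05), (40.0, 1.95), (50.0, 2.0),
             (5.0, 30.0), (33.0, 12.0)]


def hough_line(points, n_theta=180, rho_bin=1.0):
    """Return (theta, rho, votes) of the most populated accumulator cell.

    A straight line is parameterised as rho = x * cos(theta) + y * sin(theta).
    """
    accumulator = Counter()
    for x, y in points:
        for i in range(n_theta):
            theta = math.pi * i / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            accumulator[(i, round(rho / rho_bin))] += 1
    (i_best, rho_index), votes = accumulator.most_common(1)[0]
    return math.pi * i_best / n_theta, rho_index * rho_bin, votes


if __name__ == "__main__":
    theta, rho, votes = hough_line(DEMO_HITS)
    print(f"theta = {math.degrees(theta):.1f} deg, rho = {rho:.1f}, votes = {votes}")
```

The hits contributing to the winning cell would then be passed to the straight-line fit, while the noise hits are left out.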

Full-fledged track candidates are built from segments, typically starting from the
middle stations of the detector where trigger hits from TGC or RPC are available, and
extrapolating back through the magnetic field to the segments reconstructed in the
inner stations. Whenever a matching segment is found, it is added
to the track candidate. The final track-fitting procedure takes into account all relevant
effects: multiple scattering, non-uniformity of the magnetic field, inter-chamber
misalignment etc.

The physics analyses make use of four muon types. 


• Combined muons: muon tracks reconstructed by the inner tracker and MS independently
are combined with a global refit using the hits from the inner tracker
and MS detectors. In order to improve the fit quality, MS hits may be added
to or removed from the track. Most muon tracks are reconstructed by outside-in
reconstruction, where the muons are first reconstructed in the MS and then
extrapolated inward and matched to a track reconstructed by the inner tracker. An
inside-out reconstruction, where the procedure is the opposite of the
outside-in reconstruction, is also used as a complementary approach.

• Segment-tagged muons: a track in the inner tracker is classified as a muon
if it is associated with at least one local segment in the MDT stations. Segment-tagged
muons are used for low-p_T muons or for muons passing through an η–φ region that is not
covered by MS stations.

• Calorimeter-tagged muons: a track in the inner tracker is identified as a
muon if it is associated with an energy deposit in the calorimeter compatible with
a minimum-ionising particle (MIP). Muons passing through the η–φ region
that is not fully covered by the MS are recovered as this type of muon.

• Extrapolated muons: muon tracks reconstructed based only on the MS and a loose
requirement on compatibility with originating from the interaction point. The
track parameters of the muon are defined at the interaction point, taking into
account the estimated energy loss of the muon in the calorimeters. Extrapolated
muons are used to extend the acceptance for the muon reconstruction into the
region that the inner tracker does not cover.


When the same track reconstructed by the inner tracker is identified by two muon
types, priority is given to combined muons, then to segment-tagged muons,
and finally to calorimeter-tagged muons.
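
The priority rule can be sketched as a simple overlap removal keyed on the inner-tracker track. The data layout (pairs of a track identifier and a muon type) and the function name are hypothetical and chosen only to illustrate the ordering.

```python
# Muon types that share an inner-tracker track, in decreasing order of priority.
PRIORITY = ("combined", "segment-tagged", "calorimeter-tagged")


def resolve_overlap(candidates):
    """Keep one muon type per inner-tracker track according to PRIORITY.

    candidates is an iterable of (track_id, muon_type) pairs; the return value
    maps each track_id to the highest-priority type found for it.
    """
    best = {}
    for track_id, muon_type in candidates:
        rank = PRIORITY.index(muon_type)
        if track_id not in best or rank < PRIORITY.index(best[track_id]):
            best[track_id] = muon_type
    return best


if __name__ == "__main__":
    found = [(1, "calorimeter-tagged"), (1, "combined"),
             (2, "segment-tagged"), (2, "calorimeter-tagged"),
             (3, "calorimeter-tagged")]
    print(resolve_overlap(found))
```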


6.3.4 Muon Identification 


Although muon candidates reconstructed by the muon spectrometers are mostly true
muons, we want to identify the origin of the muons. Muons from the decay of heavy
particles such as W, Z, Higgs bosons, or new particles are of particular interest and
need to be reconstructed as “isolated” muons efficiently and precisely. Since muons
from semi-leptonic decays of b- and c-hadrons and from τ decays are also important for
b-tagging and τ identification, respectively, they need to be reconstructed as muons in heavy
flavour jets and τs. On the other hand, muons from the decays of pions and kaons
are regarded as “fake” muons and are eliminated from the muon candidates.

Muon candidates originating from in-flight decays of charged hadrons, mainly
pions and kaons, in the inner tracker are reconstructed with a distinctive
kink in the track. Therefore, it is expected that the fit quality of the resulting
combined track is poor and that the momenta measured by the MS and the inner
tracker are not compatible. Muon identification is performed by applying quality
requirements to suppress the background, to select prompt muons with high efficiency,
and to guarantee a robust momentum measurement. The number
of hits in the inner tracker and MS, the χ² of the combined muon track, and the difference
between the transverse momentum measurements in the inner tracker and MS relative to
their uncertainties are used to classify muons into the “Loose”, “Medium”, “Tight” and “High p_T”
categories (the last is for high momentum muons above 100 GeV, aimed at muons from exotic particles
such as Z′ and W′ bosons). These categories are provided to address the
specific needs of different physics analyses.
6.3.5 Muon Isolation


Muons originating from the decay of heavy particles such as W, Z or Higgs bosons
are often produced isolated from the other particles, in contrast to the muons from
semi-leptonic hadron decays such as b → cμν, which are embedded in jets. The
measurement of the detector activity around a muon candidate, referred to as muon
isolation, is a powerful tool for background rejection in many physics analyses. Both
track-based and calorimeter-based isolation variables are often used.

The track-based isolation variable p_T^varcone30 is defined as the scalar sum of the
transverse momenta of the tracks with p_T > 1 GeV in a cone of size ΔR =
min(10 GeV/p_T^μ, 0.3) around the muon. The muon momentum p_T^μ is excluded
from p_T^varcone30. The cone size is therefore either p_T dependent
(ΔR = 10 GeV/p_T^μ) or p_T independent (ΔR = 0.3), whichever is smaller. The p_T dependent cone
size is used to improve the performance for isolated muons with a high transverse
momentum. The calorimeter-based isolation variable E_T^topocone20 is defined as the
sum of the transverse energy of topological clusters in a cone of size ΔR = 0.2 around
the muon. The isolation selection criteria are determined using the relative isolation
variables defined as p_T^varcone30/p_T^μ and E_T^topocone20/p_T^μ. Several selection criteria are
provided to address the specific needs of different physics analyses.
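
The track-based variable can be sketched in a few lines. The code below assumes that muons and tracks are given as dictionaries with the keys "pt" (in GeV), "eta" and "phi", that the muon's own track has already been removed from the track list, and that ΔR = √(Δη² + Δφ²); these conventions and the function names are choices made for this illustration.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def pt_varcone30(muon, tracks, min_track_pt=1.0):
    """Scalar sum of track pT inside a cone of size min(10 GeV / pT_muon, 0.3)."""
    cone = min(10.0 / muon["pt"], 0.3)
    return sum(track["pt"] for track in tracks
               if track["pt"] > min_track_pt
               and delta_r(muon["eta"], muon["phi"], track["eta"], track["phi"]) < cone)


if __name__ == "__main__":
    tracks = [{"pt": 2.0, "eta": 0.0, "phi": 0.1},    # close to the muon
              {"pt": 5.0, "eta": 0.25, "phi": 0.0},   # inside 0.3 but outside 0.2
              {"pt": 0.5, "eta": 0.0, "phi": 0.05}]   # below the track pT threshold
    for muon_pt in (20.0, 50.0):
        muon = {"pt": muon_pt, "eta": 0.0, "phi": 0.0}
        iso = pt_varcone30(muon, tracks)
        print(f"muon pT = {muon_pt:5.1f} GeV: sum = {iso:.1f} GeV, relative = {iso / muon_pt:.3f}")
```

For the 50 GeV muon the cone shrinks to ΔR = 0.2, so the second track no longer contributes, which is the intended effect of the p_T dependent cone.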


6.3.6 Momentum Scale and Resolution 


Although the simulation contains a description of the detector, there is a limitation
in describing the momentum scale and the momentum resolution. For this
reason, corrections to the simulated values are often applied. The momentum scale and
resolution are parameterised by the following equation:

$$
p_T^{\mathrm{Cor}} =
\frac{p_T^{\mathrm{MC}} + \sum_{n=0}^{1} s_n(\eta,\phi)\,\left(p_T^{\mathrm{MC}}\right)^{n}}
     {1 + \sum_{m=0}^{2} \Delta r_m(\eta,\phi)\,\left(p_T^{\mathrm{MC}}\right)^{m-1} g_m}
\tag{6.4}
$$

where p_T^MC is the uncorrected transverse momentum in simulation, g_m are normally
distributed random variables with zero mean and unit width, and Δr_m(η, φ) and
s_n(η, φ) are the parameters representing the smearing of the momentum resolution and
the scale corrections applied in a specific (η, φ) detector region, respectively.

The corrections to the momentum resolution are described by the denominator of
Eq. (6.4), assuming that the relative p_T resolution can be parameterised by

$$
\frac{\sigma(p_T)}{p_T} = \frac{r_0}{p_T} \oplus r_1 \oplus r_2 \cdot p_T
\tag{6.5}
$$
with ⊕ denoting a sum in quadrature. As shown in Sect. 6.3.1, the second and
third terms of Eq. (6.5) account mainly for multiple scattering and for the resolution
effects caused by the spatial resolution of the hit measurements and the misalignment
of the muon spectrometer, respectively. The first term accounts for fluctuations of the energy loss
in the detector material. The difference in the momentum resolution between data
and simulation is parameterised by Δr_m(η, φ). The momentum in simulation is
smeared with Δr_m(η, φ) by dividing the uncorrected muon momentum by the
denominator of Eq. (6.4).

The numerator in Eq. (6.4) describes the momentum scale. The s_1(η, φ) term corrects
for inaccuracy in the description of the magnetic field integral and the dimension of
the detector in the direction perpendicular to the magnetic field. The s_0(η, φ) term corrects
for the energy loss in the detector material.
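
A minimal sketch of how such a correction could be applied to a simulated muon is given below. It reads Eq. (6.4) as a numerator with scale terms s_0 and s_1 and a denominator with three smearing terms Δr_0, Δr_1, Δr_2 multiplying (p_T)^(m−1) and independent Gaussian random numbers; this reading, the function name and all parameter values are assumptions made for illustration and are not measured corrections.

```python
import random


def corrected_pt(pt_mc, scale, smear, rng=random):
    """Apply a scale and resolution correction of the form of Eq. (6.4) to one muon.

    scale = (s0, s1) and smear = (dr0, dr1, dr2) are the parameters for the
    (eta, phi) region of the muon; pt_mc is the uncorrected simulated pT in GeV.
    """
    numerator = pt_mc + sum(s_n * pt_mc ** n for n, s_n in enumerate(scale))
    denominator = 1.0 + sum(dr_m * pt_mc ** (m - 1) * rng.gauss(0.0, 1.0)
                            for m, dr_m in enumerate(smear))
    return numerator / denominator


if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed so that the example is reproducible
    # Hypothetical parameter values, chosen only to show the mechanics.
    scale = (-0.02, 0.001)
    smear = (0.0, 0.01, 0.0002)
    values = [corrected_pt(40.0, scale, smear, rng) for _ in range(10000)]
    mean = sum(values) / len(values)
    width = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    print(f"mean corrected pT = {mean:.2f} GeV, spread = {width:.2f} GeV")
```

With all parameters set to zero the function returns the input unchanged, and with only the scale terms switched on it shifts the momentum deterministically; the random numbers enter only through the denominator.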

The momentum scale and resolution are usually studied using J/ψ → μμ and
Z → μμ decays. Since the J/ψ and Z are narrow resonances and their masses
are well known, the distributions of the invariant mass reconstructed from the two muons from
J/ψ and Z decays show clear peaks around 3 GeV and 91 GeV [14], respectively. Furthermore,
the number of non-resonant background events from decays of light and heavy
hadrons and from continuum Drell-Yan production is very small. The momentum
scale and resolution are determined from data using a fit with templates derived
from simulation, which compares the invariant mass distributions from J/ψ → μμ
and Z → μμ candidates in data and simulation. The momentum in the ranges
5 GeV < p_T < 20 GeV and 20 GeV < p_T < 300 GeV is corrected using J/ψ → μμ
and Z → μμ candidates, respectively. Figure 6.24 shows the invariant mass
distribution of J/ψ → μμ (left) and Z → μμ (right) candidate events reconstructed
with combined muons [13]. The agreement between data and simulation becomes
much better after the correction.
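
The dimuon invariant mass used in these studies follows directly from the measured (p_T, η, φ) of the two muons. The sketch below builds the four-vectors and computes the mass; the tuple layout and function names are illustrative assumptions.

```python
import math

MUON_MASS_GEV = 0.10566


def four_vector(pt, eta, phi, mass=MUON_MASS_GEV):
    """Return (E, px, py, pz) in GeV from transverse momentum, pseudorapidity and azimuth."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(muon1, muon2):
    """Invariant mass of two muons, each given as a (pt, eta, phi) tuple."""
    energy, px, py, pz = (a + b for a, b in zip(four_vector(*muon1), four_vector(*muon2)))
    return math.sqrt(max(energy * energy - px * px - py * py - pz * pz, 0.0))


if __name__ == "__main__":
    # Two back-to-back central muons with pT close to half the Z mass.
    print(f"m = {invariant_mass((45.6, 0.0, 0.0), (45.6, 0.0, math.pi)):.2f} GeV")
    # The same muons with a 1% higher momentum scale shift the peak by 1%.
    print(f"m = {invariant_mass((45.6 * 1.01, 0.0, 0.0), (45.6 * 1.01, 0.0, math.pi)):.2f} GeV")
```

The second line of output illustrates why the position of the mass peak is sensitive to the momentum scale, while its width reflects the momentum resolution.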
Fig.6.24 Dimuon invariant mass distribution of J/ψ → μμ (left) and Z → μμ (right) candidate
events reconstructed with combined muons. Reprinted under the Creative Commons Attribution
4.0 International License from [13] © CERN for the benefit of the ATLAS collaboration 2016. The
upper panels show the invariant mass distribution for data and for the signal simulation, and for
the background estimate. The points show the data, the continuous line shows the simulation with the
corrections of momentum scale and resolution, and the dashed lines show the simulation without
the corrections
6.4 Jet identification
6.4.1 Fragmentation: Partons to Particles 


This section describes the identification and reconstruction of jets. A jet in high-energy
physics is, naively speaking, a bunch of hadrons emitted in nearby
directions. It is the result of one or more partons fragmenting
into a multi-hadron state. Here we give a short introduction to how we understand the
“fragmentation” process, i.e. the underlying physics of the partons being transformed into
long-lived hadrons, followed by a discussion of algorithms to identify and reconstruct
jets.

The partons, i.e. quarks and gluons, obey the dynamics described by QCD with
one and only one parameter, the strong coupling constant α_s. The coupling constant
becomes smaller with the energy of the interaction as a consequence of the renormalisation
group equation, as shown in Fig. 2.6. The energy scale, denoted as μ, is
given as the centre-of-mass energy of the partons concerned. Since the energies
involved in each parton reaction are not measurable, the choice of the energy scale for
a process is, however, not uniquely given and we leave the discussion to elsewhere.
Here, we merely point out that there are many choices: it could be the centre-of-mass
energy or the transverse momentum of two partons when discussing parton-parton
collisions, often summed in quadrature with a heavy quark mass if a heavy quark
is involved, or the mass of the particle (W, Z, Υ ...) when discussing the decay of
particles.
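
The scale dependence of the coupling can be made concrete with the one-loop solution of the renormalisation group equation. The sketch below is a simplification: it uses a reference value α_s(m_Z) = 0.118, keeps the number of active quark flavours fixed at five and ignores higher-order terms and flavour thresholds, all of which are assumptions made here for illustration.

```python
import math


def alpha_s_one_loop(mu_gev, alpha_s_mz=0.118, m_z_gev=91.19, n_flavours=5):
    """One-loop running of the strong coupling from the reference scale m_Z to mu."""
    b0 = (33.0 - 2.0 * n_flavours) / (12.0 * math.pi)
    return alpha_s_mz / (1.0 + b0 * alpha_s_mz * math.log(mu_gev ** 2 / m_z_gev ** 2))


if __name__ == "__main__":
    for mu in (2.0, 10.0, 91.19, 1000.0):
        print(f"mu = {mu:7.2f} GeV -> alpha_s = {alpha_s_one_loop(mu):.3f}")
```

The coupling is close to 0.1 at the Z mass and grows to a few tenths at scales of a few GeV, where the perturbative description starts to break down.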

Now let us take a simple example, a decay of the Z⁰ boson into a qq̄ pair, to
understand how a parton fragments into a multi-hadron state. The energy scale
would be given as μ = m_Z. In this case, the quarks cannot be a pair of top quarks
due to energy conservation. The quarks run fast and may radiate additional gluons,
since a quark feels the force from the other quark due to the colour charge carried
by each of the quarks. The force is “strong”, as α_s is about 0.1 at the mass scale of
m_Z. The radiation of the gluon is soft in most cases, i.e. typically collinear and/or
with a small momentum fraction with respect to the parent quark, as in the case of
bremsstrahlung. But with a small probability, the gluon may be emitted at a large angle from
both of the two quarks and may carry a large momentum fraction, i.e. the radiated
gluon may be hard.

The gluons and quarks still feel the colour force and may further radiate a
qq̄ pair, g → qq̄, or radiate a further gluon, g → gg. The splitting of partons is
repeated: this process is often called a “parton shower”. After some steps of radiation,
the partons have branched into many-parton states, in which most partons have another close-by
parton, and the invariant mass of such a pair of partons is much smaller than the
initial mass m_Z. In such a situation, the coupling constant describing the interaction
of the partons becomes much larger, say 0.3 rather than 0.1. This accelerates the
fragmentation process and eventually all the partons have an invariant
mass with the nearest parton below 1 GeV. This is the energy scale of Λ_QCD, below
which perturbative QCD (pQCD) is no longer applicable; one cannot discuss the
branching of partons by perturbation theory and needs the help of a non-perturbative
approach, e.g. lattice QCD.
The lattice QCD calculation tells us that the potential energy of a qq̄ pair is linear
in the distance r between the pair, U(r) ∝ r. This means that the lines of the strong
force have about constant density and are concentrated in a tube-like area. The stored energy
gets higher as the distance becomes larger. Once the stored energy exceeds
the mass of two quarks, the total energy is lower if a qq̄ pair is produced and
the force lines are cut. This process continues until the relative distances between
all the colour-neutral qq̄ pairs become so short that there is not enough
energy left to produce an additional qq̄ pair, which marks the end of the showering process. All
the quarks and gluons are in bound states, i.e. mesons or baryons, at this stage. This
last part of the transition is called hadronisation.

Before discussing jet algorithms, we should check whether the input to the
algorithm, the four-momenta of the final state objects, is well defined. The final
state with hadrons can be defined clearly once we set a threshold on the lifetime
of the final state particles. The boundary is often chosen such that the B mesons and
charm mesons decay but the charged pions do not. The intermediate states of partons before
hadronisation, on the other hand, are less obvious in their definition and we need
certain criteria, which we discuss in the following section. One should also note
that the fragmentation is a process described by quantum field theory, and it is not
possible in principle to assign a certain parton or hadron to its parents. What we
know through the theory is only the probability with which a daughter particle is
assigned to a given parent, unless the lifetime of the parent particle is long enough
that the quantum effect is negligible.


6.4.2 Defining Jets 


While it is impossible to have a unique one-to-many correspondence between a parton
and hadrons, one may still imagine that a spray of particles, or a jet, originates
from a quark or a gluon if it looks collimated and is well separated from the
other activity, and one may wish to relate the jet to the underlying parton. This relies on
the fact that the parton emission is mostly collinear if the parents of the hadronised
partons have sufficiently high energy and run fast. In practice, however, the jets
are still “ambiguous”: if there are two close-by jets or a wide jet, it is often not
straightforward to know whether to associate one parton or two or more partons with the
jets. A hadron far from the sprays may also be ambiguous in such an assignment
or be left unassociated with any jet.

Things are even more complicated if we are to reconstruct jets using the detector
information. Not only are the momenta of the particles smeared by the detector,
but a significant fraction of the particles may also escape detection. In addition, a
measurement by calorimetry cannot resolve two close-by hadrons, since their energy
clusters may be merged into one cluster if the distance between the two hadrons at the
calorimeter is less than a certain value.

This also applies when we extend the concept of jets to partons. Parton branching
is also a quantum-mechanical process and the final state partons are not uniquely
related to their parent partons, as described above.

These facts all call for some definition of jets, or a jet algorithm that defines the
number of jets and their momenta. The algorithm has to be independent of the type
of input: a few “primary” partons, many partons after the parton shower, hadrons
before or after the meson decays, or detector measurements. The algorithm should
also identify, count and reconstruct the momenta of hard, i.e. high-momentum,
partons, while the soft emissions near hard partons should be absorbed into the hard
partons or discarded. In this sense, the algorithm should be insensitive to the soft
emissions, or “infrared safe”. In fact, the procedure of absorbing the soft particles
into the stronger jets is somewhat analogous to the procedure of renormalisation in
theoretical calculations.


6.4.3 Jet Algorithms 


Historically, there have been two kinds of jet algorithms used in high-energy physics,
one called the cone algorithm and the other the cluster algorithm. The cone algorithm
moves around a window of jet area defined as a circle in $\eta$-$\phi$ space,
$\Delta r = \sqrt{\Delta\eta^2 + \Delta\phi^2} < R$, where $\Delta r$, $\Delta\eta$ and
$\Delta\phi$ are the distances between the jet centre and the position of the particle.
$R$ is called the cone radius and gives the angular boundary of jets. The algorithm
iteratively finds an energetic cluster for which the sum of the transverse momenta of
the particles inside the circle, $p_T^{\rm jet}$, becomes maximum. The cluster is said
to be a jet if $p_T^{\rm jet} > p_T^{\rm min}$, where the threshold value
$p_T^{\rm min}$ is the parameter that ensures that the jets are hard. After the algorithm
is run, one may find many jets but also many particle clusters below $p_T^{\rm min}$,
which do not qualify as jets and are discarded. This feature is suitable for
hadron-hadron collisions, where hard jets are accompanied by many soft particles arising
from soft emissions from beam remnants, particles from soft underlying events
(rescattering of the outgoing proton remnants) and multi-parton interactions, which
together are often called “underlying events”. In addition, particles from soft
collisions pile up on the hard partons in high-luminosity collisions such as
the main LHC runs (see Sect. 6.4.4).

There still remains activity within the cone from particles that did not emerge from
the hard partons. Although the amount of underlying-event and pile-up particles
is certainly not constant, it is at least possible to subtract such contributions
statistically, since the size of the jet is the same for each jet unless jets overlap and
some part of the jet area is shared; even in such a case the net area of the overlapping
jets is still well defined. This is another virtue of the cone algorithm.
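The cone criterion can be sketched in a few lines. The example below is only an illustration: it evaluates the transverse momentum sum for one fixed cone position and applies a threshold, and it does not implement the iterative search for the cone position. The function names, the particle list and the 30 GeV threshold are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in eta-phi space, with delta-phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def cone_pt(particles, eta_c, phi_c, radius):
    """Scalar pT sum of the particles inside a cone centred at (eta_c, phi_c)."""
    return sum(pt for pt, eta, phi in particles
               if delta_r(eta, phi, eta_c, phi_c) < radius)


# Hypothetical particles given as (pT [GeV], eta, phi)
particles = [(50.0, 0.10, 0.00), (20.0, 0.30, 0.20), (1.0, 2.00, 3.00)]

pt_jet = cone_pt(particles, eta_c=0.15, phi_c=0.05, radius=0.4)
is_jet = pt_jet > 30.0  # illustrative threshold on the cone pT
```

The wrap-around in `delta_r` matters for particles on either side of $\phi = \pm\pi$, which are close in angle although their $\phi$ values differ by almost $2\pi$.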

The historical cluster algorithm, on the other hand, assigns all the particles to one
of the jets. This is suitable for $e^+e^-$ collisions, where there are no beam remnants,
multi-parton interactions or pile-up but only the soft emissions from hard partons.
The basic idea is that the soft particles should always be merged with other particles
or the nearest cluster until the energy of the cluster exceeds the threshold (see
Fig. 6.25). The “distance” between two particles, $d_{ij}$, can be defined in various
ways, which gives rise to the variants of the algorithm. The distance can be the
invariant mass squared of the two particles (the original JADE algorithm) or the relative
transverse momentum squared of the softer particle with respect to the harder particle
(often called $k_T$ or $k_\perp$): $d_{ij} = \min(p_{T,i}^2, p_{T,j}^2)$. The algorithm
sequentially combines the two particles with the smallest distance. In each step of the
combination, the four-momentum of the two merged particles is calculated to form a new
particle. There are also many choices in how to combine the momenta of two particles
(called the “recombination scheme”): the merged particle may be massless or massive,
energy may be conserved or not, etc. The definition of the distance and the recombination
scheme should be chosen such that the jet observables of concern (momentum, energy,
number, mass etc.) are well reproduced, and the choice may vary with the energy and type
of the interaction. The combination is repeated until the scaled distance between any two
jets, defined as $y = d_{ij}/M^2$, exceeds $y_{\rm cut}$, where $M$ is normally chosen as
the invariant mass of the first two outgoing partons from the $e^+e^-$ collision, which
equals the centre-of-mass energy of the $e^+e^-$ collision in most cases, except for
events with hard initial state radiation of photons.

Fig. 6.25 A schematic drawing showing how the clusters with the smallest distance are
merged with each other to form a new cluster
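The sequential merging can be sketched as follows for the JADE-type distance (the invariant mass squared of a pair). This is a minimal illustration under stated assumptions, not a reference implementation: the recombination simply adds four-momenta, the reference mass $M$ is taken as the invariant mass of all input particles as a stand-in for the centre-of-mass energy, and the function names and input momenta are hypothetical.

```python
def mass2(p):
    """Invariant mass squared of a four-momentum (E, px, py, pz)."""
    e, px, py, pz = p
    return e * e - px * px - py * py - pz * pz


def add(p, q):
    """Four-momentum sum (energy- and momentum-conserving recombination)."""
    return tuple(a + b for a, b in zip(p, q))


def jade_cluster(particles, y_cut):
    """Merge the closest pair until every scaled distance y exceeds y_cut."""
    jets = list(particles)
    total = jets[0]
    for p in jets[1:]:
        total = add(total, p)
    m2_ref = mass2(total)  # assumed reference scale M^2
    while len(jets) > 1:
        y_min, i_min, j_min = min(
            (mass2(add(jets[i], jets[j])) / m2_ref, i, j)
            for i in range(len(jets)) for j in range(i + 1, len(jets)))
        if y_min > y_cut:
            break
        merged = add(jets[i_min], jets[j_min])
        jets = [p for k, p in enumerate(jets) if k not in (i_min, j_min)]
        jets.append(merged)
    return jets


# Hypothetical event: two hard particles and one softer particle near the second
event = [(45.0, 45.0, 0.0, 0.0), (40.0, -40.0, 0.0, 0.0), (5.0, -4.0, 3.0, 0.0)]
two_jets = jade_cluster(event, y_cut=0.05)
three_jets = jade_cluster(event, y_cut=0.005)
```

With the looser cut the soft particle is absorbed into its neighbour and two jets remain; with the tighter cut it survives as a third jet, which shows how the jet multiplicity depends on $y_{\rm cut}$.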

The biggest advantage of the cluster algorithm over the cone algorithm is that it has
none of the ambiguity that originates from the iterative procedure in the
cone algorithms. The cone algorithm needs seeds to start the iteration. It is well
known that the number of jets and the jet momenta are strongly affected by the choice
of the seed properties (the $p_T$ threshold and the cone size used to define a seed) when
particles are densely populated. A jet could be split into two jets depending on the
seed choice. This means that the result of the jet finding is affected by soft particles
(which could be the seed). It is known that a naive cone algorithm is not infrared
safe, i.e. the result of the algorithm may depend on the presence of a particle with
infinitesimally small energy.

A new class of cluster algorithms for hadron-hadron collisions was then invented
by borrowing two virtues of the cone algorithms: (a) the algorithm works in
$\eta$-$\phi$-$p_T$ space so that it is boost invariant, and (b) particles below the
threshold are discarded. A typical arrangement is to introduce two particles with
infinite momentum on the beam axis. In the algorithm, the distance to the beam particles,
$d_i$, is defined as $d_i = p_{T,i}^2$, where $p_{T,i}$ is the transverse momentum of the
$i$th particle, in addition to $k_T$ as the distance parameter between two final state
particles, $d_{ij}$. If $d_i$ of a particle, the distance to the beam axis, is smaller
than any of the $d_{ij}$, the distances to the other final state particles, the particle
is merged with the beam particle.

Fig. 6.26 A schematic drawing showing how clusters with large momenta are classified:
to be merged with each other to form a new cluster, or to the beam axis to be considered
as a jet. Labels in the drawing: $d_{ij} > d_i$: merged to the beam axis on the next
step; $d_{ij} < d_i$: these will be merged on the next step

Also, $d_{ij}$ is adjusted to the hadron collider environment. The first version of the
algorithm, the $k_T$ algorithm, uses the distance parameter


$$d_{ij} = \min(p_{T,i}^2, p_{T,j}^2)\, \Delta r_{ij}^2 / R^2 \qquad (6.6)$$


where $\Delta r$ is that used in the cone algorithms,
$\Delta r = \sqrt{\Delta\eta^2 + \Delta\phi^2}$, and $R$ is the
radius parameter. The particle with the smallest $p_T$ is merged with the beam if
$\Delta r$ is larger than $R$ for all other particles, since then $d_i$ is smaller than
any of the $d_{ij}$. It is merged with the nearest particle if $\Delta r < R$. In this
way, the parameter $R$ plays the role of the cone radius in cone algorithms (Fig. 6.26).

The value of $R$ gives the angular size of the jets. This parameter controls the
extent to which hard parton radiation around the primary parton is included in the jet,
in addition to the soft emissions and/or the collinear part of the radiated partons. The
size should not be so small that the soft/collinear particles are lost, but should not be
too large either, since more particles from beam-related activities (soft underlying
events, multi-parton interactions and pile-up particles) then enter the jet area. Typical
values used for QCD studies at the energy scale of weak interactions
($p_T \sim m_Z/2$) are 0.6-0.7, chosen to include the soft emission originating from the
parent partons in order to reduce the theoretical uncertainty in the pQCD description of
the data. For higher energy interactions, 0.4-0.5 is preferred, in particular for
searches for physics beyond the SM (BSM) at the TeV scale, to minimise the effect of soft
particles on the momentum or mass reconstruction of the parent BSM particles.

The clustering procedure is finished when there is no possibility left to merge the
remaining particles with each other, only with the beam particle. The particles above a
given $p_T$ threshold are defined as jets and the others are discarded, i.e. merged with
the beam particle.

For the original $k_T$ algorithm, where $d_{ij}$ is defined by Eq. (6.6), it is known
that the jet area tends to extend beyond the area given by the parameter $R$. This feature
is undesirable for the mass reconstruction, as discussed just above. More recently, the
anti-$k_T$ algorithm has become more popular, where $d_{ij}$ is defined as:


$$d_{ij} = \min\left((p_{T,i})^{-2}, (p_{T,j})^{-2}\right) \Delta r_{ij}^2 / R^2$$


With this distance parameter, the algorithm first merges the pairs within the maximum
allowed distance, $\Delta r_{ij} \sim R$, making a merged particle in between. The same
thing happens in the next iteration: the furthest particle from the new merged particle
is absorbed. This implies that the direction of the jet particle, or the jet axis,
oscillates between the merged particles at first, but the axis is stabilised at
a later stage where only particles with small $k_T$ with respect to the jet axis are left,
which are eventually merged into the jet. As a consequence, the jet area has
a clear circular boundary with radius $R$. The area of overlapping circles is
absorbed into the more energetic jet. This gives jets very similar to those given
by the cone algorithm, which have a definite area. One can statistically subtract the
underlying events from the jets in such a case. The anti-$k_T$ algorithm successfully
combines the virtues of the cone and cluster algorithms and is now the most popular jet
algorithm at the LHC.
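The $k_T$ distance of Eq. (6.6) and its anti-$k_T$ counterpart differ only in the power of $p_T$, so both can be sketched with one routine. The code below is a simplified illustration and not the implementation used by the experiments: the `power` argument, the $p_T$-weighted recombination in `combine` and the input particles are assumptions made here, and the pair search is the naive one.

```python
import math


def delta_r2(a, b):
    """Squared eta-phi distance with delta-phi wrapped into [-pi, pi)."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return (a["eta"] - b["eta"]) ** 2 + dphi ** 2


def combine(a, b):
    """pT-weighted recombination (a simple scheme assumed for illustration)."""
    pt = a["pt"] + b["pt"]
    dphi = (b["phi"] - a["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return {"pt": pt,
            "eta": (a["pt"] * a["eta"] + b["pt"] * b["eta"]) / pt,
            "phi": a["phi"] + b["pt"] * dphi / pt}


def cluster(particles, radius, power, pt_min):
    """power = 1 gives the kT distance (Eq. 6.6), power = -1 the anti-kT one."""
    objs = [dict(p) for p in particles]
    jets = []
    while objs:
        best_d, best_i, best_j = float("inf"), -1, -1  # best_j = -1 means beam
        for i, a in enumerate(objs):
            d_i = a["pt"] ** (2 * power)               # distance to the beam
            if d_i < best_d:
                best_d, best_i, best_j = d_i, i, -1
            for j in range(i + 1, len(objs)):
                b = objs[j]
                d_ij = (min(d_i, b["pt"] ** (2 * power))
                        * delta_r2(a, b) / radius ** 2)
                if d_ij < best_d:
                    best_d, best_i, best_j = d_ij, i, j
        if best_j < 0:
            done = objs.pop(best_i)      # merged with the beam: object is final
            if done["pt"] > pt_min:
                jets.append(done)        # kept as a jet, otherwise discarded
        else:
            b = objs.pop(best_j)         # best_j > best_i, so pop it first
            a = objs.pop(best_i)
            objs.append(combine(a, b))
    return jets


# Hypothetical event: two hard particles, one soft neighbour, one isolated soft one
particles = [{"pt": 100.0, "eta": 0.0, "phi": 0.0},
             {"pt": 2.0, "eta": 0.3, "phi": 0.1},
             {"pt": 60.0, "eta": 0.0, "phi": 3.0},
             {"pt": 1.0, "eta": 2.5, "phi": -2.0}]
jets_kt = cluster(particles, radius=0.4, power=1, pt_min=20.0)
jets_antikt = cluster(particles, radius=0.4, power=-1, pt_min=20.0)
```

For this sparse event both choices return the same two jets; the difference in jet shape discussed above only appears when many soft particles populate the region around a hard one.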

Naturally, the criteria that define the jets of partons are based on continuous
parameters, such as $p_T$ and $R$, which have no characteristic scale apart from
$\Lambda_{\rm QCD}$, which is in any case far below the typical jet momentum
(>O(10) GeV). There is some arbitrariness in the parameter values of such algorithms, and
the parameters may have to be optimised for each application.


6.4.4 Calibrating Jet Measurements 


As described in Sect. 6.4.1, the jets are expected to be collimated at high energies,
primarily since radiation in the final state is less pronounced for high-$p_T$ jets, where
the coupling constant $\alpha_s$ is smaller. There the individual hadrons that make up a
jet cannot easily be resolved, and the calibration of the detector is performed at the
level of jets instead of the constituent hadrons. The result of the jet calibration is
often called the jet energy scale (JES).

Since a jet is defined by means of an algorithm, the momenta of jets depend on
the choice of the algorithm and the jet finder parameters (e.g. $R$). The calibration of
jets should be repeated for each choice of the algorithm and the set of parameters. There
is another choice to make: whether the energy is to be corrected to the “particle level”,
i.e. to jets built from the momenta of particles, or to the parton level, where partons
are the input to the jet algorithm. The general consensus is that the correction to the
particle level is to be applied, i.e. the momentum of the particle-level jet gives the
reference for the detector-level jet that matches it in $\eta$-$\phi$ space. In this way,
one can avoid the theoretical uncertainty on the correction factor from particle-level to
parton-level jets, which is expected to improve as the theory advances.

The simplest reconstruction of jets starts from calorimeter information only, as
described in Sect. 6.4.3. The energy calibration for calorimeter objects is done to
the electromagnetic scale, ignoring that $e/h \neq 1$ (see Sect. 5.4.3), or to the
hadronic scale after applying the $e/h$ correction. In principle, a calorimeter cluster
can be replaced with a matched track to improve the momentum resolution, if the spatial
resolution of the calorimeter cluster is fine enough to resolve close-by particles.

There still remain differences between the particle-level and detector-level jets. In
addition to the factors arising from calorimetry, such as longitudinal hadron shower
leakage and the intrinsic dependence of the calorimeter response on the type of particle
(the $e/h$ ratio, the response to muons and neutrinos), the following effects specific to
jets cause a significant shift in the measured momentum:


• particles escaping outside the jet area


If the jet is wider than the radius used in the jet finder, the particles outside the jet
area are lost and the jet energy is underestimated. As explained above, the jet is
wider for low energy jets since the partons radiate more often. This leak, however,
should be a part of the jet definition for both parton-level and particle-level jets.
Further leakage occurs when the particles are bent by the solenoidal magnetic field
applied to the central tracker. This is to be corrected through the jet energy scale.


A gluon radiates further partons more often than a quark because of the different colour
factor (9/4 vs. 1). In general, a gluon jet is wider than a quark jet and its particle
spectrum is softer because of the additional radiation. A gluon jet contains more
particles with a smaller average energy than a quark jet. This again leads to more
leakage through the magnetic bending of particles.


An additional correction depending on the jet properties, such as the transverse size of
the jet or the number of tracks matched to the jet, would improve the energy resolution
of jets.


• response of heavy-quark jets


Some shift may remain for c-quark and b-quark jets (c/b-jets). A b-jet may decay
semi-leptonically, to a lepton ($e$, $\mu$, $\tau$), a neutrino and a lighter c- or
u-quark jet. The c-quark jet may again decay semi-leptonically. As a consequence, c/b-jets
may contain one or more electrons or muons and one or more neutrinos. The momentum of
neutrinos cannot be measured; moreover, a muon leaves only up to two GeV of energy in the
calorimeter (MIP). Therefore, the energy response for b- and c-quark jets is, in general,
smaller than for other kinds of jets (u, d, s or gluon jets, often called “light flavour
jets”). The actual difference depends on many factors, e.g. on how the muon momentum is
taken into account and on the $e/h$ ratio, which may affect the energy of a jet containing
an electron. In any case, it is common practice to apply an additional correction if a jet
is identified as a heavy-quark jet.


• pile-up


One can safely assume that the events that pile up on top of a collision of interest
are all soft interactions (see Sect. 2.5). The soft interaction events are often called
“minimum-bias” events, since they are recorded with triggers that impose as few
requirements as possible, e.g. a small energy in the very forward part of the calorimeter.
The average $p_T$ from such minimum-bias events at the LHC energy
($\sqrt{s} \sim 14$ TeV) is about 2.4 GeV per unit area of $\eta$-$\phi$ space. This means
that a jet with radius $R = 0.4$ in the presence of 20 additional minimum-bias events has
an average $p_T$ offset of about 24 GeV, which is a very large shift in energy for jets
with $p_T^{\rm jet} < O(100)$ GeV. In order to reduce the influence of pile-up particles,
the expected average $p_T$ from pile-up is subtracted. The actual value to
be subtracted depends on the number of pile-up events. The average number of pile-up
events can be estimated from the luminosity of the collisions. The Poisson fluctuation
around the average can further be corrected by measuring the number of interactions per
bunch crossing through, e.g., $N_{\rm PV}$, the number of primary vertices per crossing
reconstructed from the central tracker.
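The 24 GeV figure quoted above follows from the jet area $\pi R^2$ multiplied by the $p_T$ density and the number of pile-up events. A minimal sketch of this arithmetic is given below; the function name and the 60 GeV jet are hypothetical, and a real correction would use a measured density rather than a fixed constant.

```python
import math


def pileup_offset(radius, n_pileup, pt_density=2.4):
    """Average pT (GeV) added to a jet of the given radius by n_pileup
    minimum-bias events, each depositing pt_density GeV per unit eta-phi area."""
    return math.pi * radius**2 * pt_density * n_pileup


offset = pileup_offset(radius=0.4, n_pileup=20)  # about 24 GeV
corrected_pt = 60.0 - offset                     # hypothetical 60 GeV measured jet
```

Because the offset scales with $R^2$, the same pile-up conditions shift an $R = 0.6$ jet by more than twice as much as an $R = 0.4$ jet.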


The residual difference from imperfect simulation of the detector and remaining
miscalibration of the detectors is corrected by in-situ measurements of the jet response.
The most common way to determine the overall jet energy scale is to find and use
physics processes with a jet whose energy can be deduced through energy-momentum
conservation. In hadron collider experiments, for example, the production of
$\gamma$+jet or $Z$+jet is widely used as the calibration source, where the photon
or the $Z$ reconstructed from a dilepton pair can be the reference for the jet energy.
This works because the fluctuation of the energy deposited by an electromagnetic shower,
or of the measured charged track momentum, is much smaller than that of a hadronic shower,
leading to a more precise energy measurement than that of the hadron calorimeter. However,
since momentum conservation holds only in the plane perpendicular to the beam axis in a
hadron collider, what is conserved is $p_T$, not $p$. More concretely, in $\gamma$+jet
events the jet energy scale is adjusted so that the $p_T$ of the jet is equal to that of
the photon. The result of the calibration is illustrated in Fig. 6.27. A clear peak can be
seen in the $\gamma$+jet events, with the peak close to unity as expected.
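The $p_T$ balance idea can be sketched as follows. This is only a toy illustration: the event list is invented, a single global scale factor is derived, and the function name is hypothetical. An actual calibration would be derived in bins of jet $p_T$ and $\eta$ and would account for additional radiation in the event.

```python
from statistics import mean


def jet_response(events):
    """Mean of pT(jet) / pT(photon) over (pt_jet, pt_gamma) pairs."""
    return mean(pt_jet / pt_gamma for pt_jet, pt_gamma in events)


# Hypothetical (pT_jet, pT_gamma) pairs in GeV before the in-situ correction
events = [(92.0, 100.0), (47.0, 50.0), (188.0, 200.0), (71.0, 75.0)]

response = jet_response(events)   # below 1: jets are measured too low
scale = 1.0 / response            # multiplicative correction to the jet pT
calibrated = [(pt_jet * scale, pt_gamma) for pt_jet, pt_gamma in events]
```

After applying `scale`, the mean ratio is unity by construction, which corresponds to the peak position near one described in the text.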

With a concept similar to the calibration of the electromagnetic scale, $Z \to q\bar{q}$
can in principle be used as a calibration source for jets, with the $Z$ mass as the
reference target. However, this method does not work in hadron collider experiments in
practice because of the overwhelming dijet background generated by QCD processes.
In addition, the jet energy resolution is much worse than that of the electromagnetic
energy measurement, which makes it difficult to see the resonant peak from
$Z \to q\bar{q}$.


Fig. 6.27 The ratio of $p_T$ of


6.5 Reconstructing Missing Momentum 


The momenta of neutral, weakly interacting particles, such as neutrinos and unknown
neutral particles, are not detected by collider detectors. For hadron colliders, the
longitudinal momentum of such particles cannot be known, because the longitudinal momentum
of the collision itself is not known. The missing transverse momentum, denoted as
either $p_T^{\rm miss}$ or $E_T^{\rm miss}$, can still be reconstructed as the negative of
the vector sum of the transverse momenta of the observed particles. A first approximation
of the sum of the visible particle momenta can simply be obtained from the $x$ and $y$
components of the calorimeter cell energies, $E_i^{\rm cell} \sin\theta_i \cos\phi_i$ and
$E_i^{\rm cell} \sin\theta_i \sin\phi_i$. This would miss the muon momenta and would
ignore the detailed calibration that depends on the final state objects. Instead, one may
measure each category of final state objects separately, with proper calibration and
possibly with the help of the tracking and muon detectors, for example,


$$\vec{p}_T^{\,\rm miss} = -\sum \left[ \vec{p}_T^{\,e} + \vec{p}_T^{\,\gamma} + \vec{p}_T^{\,\tau} + \vec{p}_T^{\,\rm jets} + \vec{p}_T^{\,\mu} + \vec{p}_T^{\,\rm others} \right]$$


as is done for the ATLAS experiment. Here the $p_T^{\rm others}$ term is the momentum of
the particles that belong to none of the objects identified as a charged lepton, a jet or
a photon. This term, often called the “soft term”, includes the rest of the particles
accompanying the hard interaction, such as particles from ISR, multi-parton events and
underlying events, which should also be added to the $p_T^{\rm miss}$ calculation. The
soft term, however, also includes particles from pile-up. Since the average transverse
energy of a minimum-bias event is about 100 GeV, the total transverse energy of an event
with >20 pile-up events would be about 2 TeV and increases in proportion to
the number of pile-up events. The resolution of this pile-up component directly affects
the missing $p_T$ calculation. Therefore, the performance of the missing $p_T$
reconstruction strongly depends on how the missing vector is estimated from the soft term
and, to a lesser extent, from the jet term. In addition, any misreconstruction in the
detector, such as noise in the calorimeter, affects $p_T^{\rm miss}$ through the soft term.
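The object-based sum can be sketched as a negative vector sum over calibrated objects. The example below is a minimal illustration with an invented event; the category names in the dictionary and the function name are hypothetical and only mirror the object categories named in the text.

```python
import math


def missing_pt(objects):
    """Return (magnitude, phi) of the negative vector sum of the object pT.

    objects: iterable of (pT [GeV], phi) pairs.
    """
    px = -sum(pt * math.cos(phi) for pt, phi in objects)
    py = -sum(pt * math.sin(phi) for pt, phi in objects)
    return math.hypot(px, py), math.atan2(py, px)


# Hypothetical calibrated objects, grouped by category
event = {
    "electrons": [(40.0, 0.0)],
    "muons": [],
    "jets": [(30.0, math.pi / 2)],
    "others": [],  # soft term, left empty in this toy event
}
all_objects = [obj for group in event.values() for obj in group]
met, met_phi = missing_pt(all_objects)
```

In this toy event the electron and the jet are perpendicular, so the missing transverse momentum has magnitude 50 GeV and points opposite to their vector sum.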

Various algorithms have been developed in order to mitigate the degradation of the
resolution with the number of pile-up events. A simple algorithm is to reconstruct the
soft term by using only the calorimeters or only the central tracker. The latter has the
benefit that it can remove all the track momenta that do not originate from the vertex of
the hard scattering of concern. It misses, however, the contribution from neutral
particles like $\pi^0$ and $K_L^0$. A further refined algorithm could be to reweight the
calorimeter soft term by the momentum fraction of the tracks that come from the primary
vertex (PV): $\sum_{\rm tracks,PV} p_T / \sum_{\rm tracks} p_T$. Each estimator has a
different resolution and tail. Now suppose that we find a long tail in the reconstructed
$p_T^{\rm miss}$ for events with zero missing $p_T$ at truth level. It is
found that the tails of the soft terms from the tracking and calorimeter algorithms are
not strongly correlated in such events. For that reason, it is often useful to use more
than one algorithm to reduce the tail induced by the $p_T^{\rm miss}$ reconstruction. This
also indicates that the choice of the soft term reconstruction depends on the type of
events of concern.
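The reweighting of the calorimeter soft term by the primary-vertex track fraction can be sketched as below. This is a hedged illustration of the ratio described in the text, not the algorithm of any particular experiment; the function names, the track list and the soft-term components are invented.

```python
def track_pv_fraction(tracks):
    """Fraction of the summed track pT that comes from the primary vertex.

    tracks: iterable of (pT [GeV], from_primary_vertex) pairs.
    """
    total = sum(pt for pt, _ in tracks)
    if total == 0.0:
        return 0.0
    return sum(pt for pt, from_pv in tracks if from_pv) / total


def reweighted_soft_term(calo_soft, tracks):
    """Scale the calorimeter soft term (px, py) by the PV track fraction."""
    f = track_pv_fraction(tracks)
    return (calo_soft[0] * f, calo_soft[1] * f)


# Hypothetical soft tracks: two from the primary vertex, two from pile-up vertices
tracks = [(3.0, True), (1.0, True), (4.0, False), (2.0, False)]
fraction = track_pv_fraction(tracks)
soft = reweighted_soft_term((10.0, -5.0), tracks)
```

The direction of the soft term is kept from the calorimeter, which also sees neutral particles, while its size is reduced according to how much of the charged activity belongs to the hard-scatter vertex.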


6.6 Identification of b-Jet and τ-Jet


In many types of physics analyses in collider experiments, the identification of
b-quark jets (b-jets) or τ-jets is of particular importance. For example, the Higgs boson
has large branching fractions for $H \to b\bar{b}$ and $H \to \tau^+\tau^-$. The top quark
decays into $b$ and $W$ with a probability of almost 100%. This section describes the
methods used to identify b-jets and τ-jets.


6.6.1 b-Jet 


There are two approaches to b-jet identification, or b-tagging. The first one
exploits the fact that a b-hadron generated at the collision point travels a few mm
before it decays ($c\tau$ of $B^0$ is 455 μm, for example), leaving a secondary vertex or
a collection of tracks that have large impact parameters with respect
to the primary vertex. We refer to this type of b-tagging as track-based tagging
below. The second one exploits the fact that the b-hadron decay is accompanied by
leptons with high probability, because the branching fraction of the semi-leptonic
decay of the b-quark is approximately 11%. In addition, the b-quark decays to a c-quark
plus something else with a probability close to 100%, where the semi-leptonic branching
fraction of the c-quark is about 10%. Hence, the existence of a lepton near a jet can be
a signature of a b-quark jet, or in fact also of a c-quark jet. We refer to this second
type of b-tagging as soft lepton tagging. In either method, all jets are b-jet
candidates, i.e. all jets are examined to see whether they originate from b-quarks. In
actual applications, the two methods are often combined. More precisely, there are several
variants of the track-based tagging, and the discriminants from each tagging method are
often unified with a multivariate analysis technique for better discrimination.


6.6.1.1 Track Based Tagging 

The track-based b-tagging makes use of the difference in lifetime between b-hadrons
and other more common particles generated by the collisions, such as pions or protons.
As schematically shown in Fig. 6.28, b-hadrons typically fly a few mm from
the collision point before they decay, hence producing particles that emerge from a
space point away from the primary vertex. On the other hand, light partons, such as
u, d, or gluon, generate only light hadrons that appear from the beam-beam interaction
point, giving charged particles associated with the primary vertex. Using this
difference, the track-based b-tagging algorithms either search for tracks with an
impact parameter significantly away from zero, or explicitly reconstruct the secondary
vertex formed by the decay products of b-hadrons. One thing the reader should keep
in mind is the existence of particles generated at the primary vertex even in a b-jet,
because of the by-products of b-quark hadronisation, which are also shown in Fig. 6.28.
These particles sometimes degrade the b-tagging capability because they mimic jets
originating from light quarks. This becomes more striking if the momentum or energy
of the original b-quark is higher. More energetic partons end up with more particles
through hadronisation, while the number of decay products of the b-hadron does not
depend on the momentum of the parent b-hadron or b-quark, i.e. the fraction of particles
from the primary vertex compared to those from the secondary vertex increases as
the b-quark gets harder.

Fig. 6.28 Schematic drawing of a b-jet. $d_0$ is the impact parameter of the tracks with
respect to the primary vertex. The signed impact parameter is defined to be the distance
of the $d_0$ projected to the jet axis with a sign, where the sign is defined to be
positive if the track crosses the jet axis in the region towards the jet direction as seen
from the primary vertex, and negative if the crossing point is behind the primary vertex.
The decay length, $L_{xy}$, is defined to be the b-hadron flight length or, more
specifically, the distance between the primary and the secondary vertices in the x-y plane

Figure 6.29 shows the signed impact parameter significance of the tracks in
simulated $t\bar{t}$ events. The definition of the signed impact parameter is explained in
the caption of Fig. 6.28. As can be seen, b-jets have more tracks with
large values than the other jets originating from u-, d-, s-quarks or gluons,
which are referred to as light jets. Note that a perfect detector with
infinite position resolution would not give a negative value of the signed impact
parameter, because the vector drawn from the primary to the secondary vertex should
be along the jet direction. Therefore, with such a detector, if it existed, the signed
impact parameter distribution would have a sharp peak at zero plus a tail only on
the positive side due to the contribution from b- or c-hadrons etc. In other words,
the negative values are caused by the detector resolution. The width of the peak around
zero, therefore, represents the detector resolution.
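The sign convention can be sketched in the transverse plane as follows. The code uses a common simplification that is an assumption here, not the exact definition of Fig. 6.28: the sign is taken from the projection of the displacement of the track's point of closest approach onto the jet direction. The function name, the coordinates (in mm) and the uncertainty are invented.

```python
import math


def signed_d0_significance(poca_xy, pv_xy, jet_phi, sigma_d0):
    """Signed transverse impact parameter divided by its uncertainty.

    poca_xy: point of closest approach of the track to the primary vertex.
    pv_xy:   primary vertex position; jet_phi: azimuth of the jet axis.
    """
    dx = poca_xy[0] - pv_xy[0]
    dy = poca_xy[1] - pv_xy[1]
    d0 = math.hypot(dx, dy)
    # Positive if the displacement points into the hemisphere of the jet
    along_jet = dx * math.cos(jet_phi) + dy * math.sin(jet_phi)
    sign = 1.0 if along_jet >= 0.0 else -1.0
    return sign * d0 / sigma_d0


# Hypothetical track displaced by 50 micrometres with a 10 micrometre uncertainty
s_forward = signed_d0_significance((0.03, 0.04), (0.0, 0.0), 0.9, 0.01)
s_backward = signed_d0_significance((0.03, 0.04), (0.0, 0.0), 0.9 + math.pi, 0.01)
```

The same displacement gives a significance of +5 when it lies on the jet side and -5 when the jet points the other way, which is how resolution effects populate the negative side of the distribution.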

The simplest application of track-based tagging is just to count the number
of tracks that pass selection criteria designed to pick up tracks that do not come from
the primary vertex. In slightly more complicated approaches, the likelihood of the
track impact parameters is formed and used as the discriminant. One example is
the jet probability algorithm, where the probability density function of the impact
parameter for tracks that come from the primary vertex is created a priori, and
then the likelihood of each charged particle being consistent with a track from
the primary vertex is calculated. In many cases there are many tracks in a jet, and
hence the likelihoods assigned to the tracks are combined to form a likelihood or
discriminant for the jet of concern. The benefit of this method is that one only examines
whether a track is compatible with the hypothesis that it comes from the collision point,
and no a priori knowledge of b-jets is required.

Fig. 6.29 Impact parameter significance distributions of tracks inside b-, c-, or light
jets. The distributions are obtained from the ATLAS simulation. Reprinted under the
Creative Commons Attribution 4.0 International License from [16] © 2015 CERN for the
benefit of the ATLAS Collaboration
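One way to combine per-track probabilities into a jet-level quantity is sketched below. The specific choices are assumptions made for illustration and are not stated in the text: the resolution function is taken to be a unit Gaussian, only tracks with positive signed significance are used, and the combination formula $P_{\rm jet} = \Pi \sum_{k=0}^{N-1} (-\ln \Pi)^k / k!$ with $\Pi = \prod_i P_i$ is the one commonly used for this purpose.

```python
import math


def track_probability(significance):
    """Probability for a prompt track to show at least this |d0 / sigma|,
    assuming (for illustration) a unit-Gaussian resolution function."""
    return max(math.erfc(abs(significance) / math.sqrt(2.0)), 1e-300)


def jet_probability(significances):
    """Probability that the jet is consistent with prompt tracks only."""
    probs = [track_probability(s) for s in significances if s > 0.0]
    if not probs:
        return 1.0
    log_pi = sum(math.log(p) for p in probs)
    terms = sum((-log_pi) ** k / math.factorial(k) for k in range(len(probs)))
    return math.exp(log_pi) * terms


# Hypothetical signed significances of the tracks in two jets
p_light = jet_probability([0.5, 0.8, 0.3, -0.4])
p_b = jet_probability([4.0, 6.0, 3.5, -0.2])
```

A small jet probability means that the tracks are unlikely to be prompt, so a b-jet candidate is selected by requiring the value to be below a threshold. Only the prompt-track resolution function enters, in line with the statement that no a priori knowledge of b-jets is needed.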

As a further step, one can construct a b-jet likelihood based on a probability
density function for the tracks produced by b-hadron decays that is obtained a priori,
for example from simulation. Taking the likelihood ratio of the b-jet hypothesis and
the light-jet hypothesis gives improved discrimination power over the
jet probability, where in principle only the light-jet hypothesis is used. In actual
applications, special care needs to be taken in forming the b-jet probability density
function. We must use a correct probability density function and would like to confirm
that the a priori knowledge or simulated data reproduces real data, but it is not easy to
extract an unbiased b-jet sample with high purity. This is in contrast to the jet
probability tagging, where a light-jet sample can easily be accumulated with high
purity. Hence, the jet probability method is more robust, although its discriminating
power is lower, and it is suitable for use in the early stage of an experiment.
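The likelihood-ratio discriminant can be sketched with toy probability density functions. Both densities below are invented for illustration (a unit Gaussian for prompt tracks, and the same core plus an exponential positive tail for tracks in b-jets); in practice they would be taken from simulation or data. The function and parameter names are hypothetical.

```python
import math


def pdf_light(s):
    """Toy pdf of the signed significance for prompt tracks: unit Gaussian."""
    return math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi)


def pdf_b(s, tail_fraction=0.5, tail_scale=5.0):
    """Toy pdf for tracks in b-jets: prompt core plus exponential positive tail."""
    tail = math.exp(-s / tail_scale) / tail_scale if s > 0.0 else 0.0
    return (1.0 - tail_fraction) * pdf_light(s) + tail_fraction * tail


def log_likelihood_ratio(significances):
    """Sum over tracks of log(pdf_b / pdf_light); larger values are more b-like.

    Note: very large |s| would underflow the toy Gaussian and needs a floor.
    """
    return sum(math.log(pdf_b(s) / pdf_light(s)) for s in significances)


llr_b = log_likelihood_ratio([4.0, 6.0, 3.5])
llr_light = log_likelihood_ratio([-0.5, 0.2, -1.0])
```

Unlike the jet probability, this discriminant needs a model of the b-jet tracks, which is where the requirement of a well-understood b-jet sample enters.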


Another type of track-based b-tagging algorithm explicitly reconstructs the secondary
vertex caused by the decay of the b-hadron. The secondary vertex is reconstructed
as already described in Sect. 6.1.4. After finding the secondary vertex, the b-tagging
algorithm usually sets a threshold on the significance of the decay distance defined
in Fig. 6.28, which is the distance between the primary and secondary vertices, or the
flight length of the b-hadron in the plane perpendicular to the beam axis.

Once the secondary vertex is formed, some extra information, such as the invariant
mass calculated from the tracks associated with the secondary vertex, can be
extracted, resulting in higher rejection power against light jets than the impact
parameter based b-tagging method. However, the efficiency is the key issue, because one
needs at least two tracks to find a secondary vertex, while even one track can give
some discriminating power in the impact parameter based tagging.


In either type of algorithm, impact parameter based or secondary vertex
reconstruction, one also needs to remove the tracks which emerge from a secondary
vertex whose origin is not a b-hadron. Tracks generated by the decays of K_S
and Λ, and by photon conversions are typical examples. In many applications, the
algorithm looks for two-track combinations whose invariant mass is consistent with
K_S, Λ or a photon and removes them from the list of tracks to consider.
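A two-track veto of this kind can be sketched as follows for the K_S case. This is a minimal illustration: both tracks are assumed to be charged pions, the mass window of 20 MeV is a hypothetical value chosen for the example, and the opposite-charge and vertex requirements used in practice are omitted.

```python
import math

PION_MASS = 0.13957   # GeV, charged pion mass
KS_MASS = 0.49761     # GeV, K_S mass

def four_vector(px, py, pz, mass):
    """Return (E, px, py, pz) for a track under a given mass hypothesis."""
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def invariant_mass(p1, p2):
    """Invariant mass of the sum of two four-vectors."""
    e = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))

def is_ks_candidate(track1, track2, window=0.02):
    """Tracks are (px, py, pz) in GeV; 'window' is a hypothetical half-width."""
    mass = invariant_mass(four_vector(*track1, PION_MASS),
                          four_vector(*track2, PION_MASS))
    return abs(mass - KS_MASS) < window

# Two pions back to back with the momentum of a K_S decay at rest.
p_star = math.sqrt((KS_MASS / 2.0) ** 2 - PION_MASS ** 2)
print(is_ks_candidate((p_star, 0.0, 0.0), (-p_star, 0.0, 0.0)))
```

Track pairs flagged in this way would be dropped from the track list before the impact parameter or secondary vertex algorithm is run. The Λ case would use a proton and a pion mass hypothesis instead.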


6.6.1.2 Soft Lepton Tagging 

A b-jet contains a charged lepton nearby with high probability, which comes either
from the direct semi-leptonic decay of a b-hadron or from the cascade decay through a
c-hadron. Another possible source of charged leptons is the leptonic decay of W or
Z. But these decays do not produce additional jets, i.e. such leptons are “isolated”. This
difference, isolated vs. non-isolated, is used very frequently and efficiently to discriminate
whether the charged lepton in question originates from a b-jet or from W/Z. The
typical application to identify a b-jet is therefore to require a jet to have a charged
lepton nearby; for example, ΔR between the jet and the charged lepton is required to be
smaller than a threshold. This technique is called soft lepton tagging.
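The nearby-lepton requirement can be sketched as below. The sketch assumes the usual definition of the angular distance as the quadratic sum of the differences in pseudorapidity and azimuthal angle. The threshold of 0.4 and the function names are hypothetical choices for illustration.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = phi1 - phi2
    while dphi > math.pi:
        dphi -= 2.0 * math.pi
    while dphi <= -math.pi:
        dphi += 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def has_soft_lepton(jet, leptons, max_dr=0.4):
    """jet and leptons are (eta, phi) pairs; max_dr is a hypothetical threshold."""
    return any(delta_r(jet[0], jet[1], lep[0], lep[1]) < max_dr for lep in leptons)

jet = (0.5, 3.0)
muons = [(0.6, -3.1), (-1.2, 0.3)]
print(has_soft_lepton(jet, muons))
```

The wrapping in `delta_phi` matters for objects on either side of the boundary at plus or minus pi, as in the example, where the first muon is close to the jet although the raw difference in azimuth is large.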

In principle, we can use any charged lepton for soft lepton tagging. However,
only muons are used in practice because of the difficulty in identifying τ’s and
non-isolated electrons. The identification of τ-jets is discussed in the next section. A
non-isolated electron often shares its electromagnetic shower with the showers or
energy deposits of the constituents of the jet. This is in contrast to the muon case:
only a muon can penetrate the calorimeter and reach the muon detector even inside a
jet. Hence, non-isolated muons can still be identified with high efficiency and a low
fake rate.

The possible background sources for soft lepton tagging with muons are either
punch-through (see Sect. 3.3.2) or decays of hadrons. This is basically the
background in muon identification. Thus, the muon identification capability mostly
determines the performance of soft muon tagging.

To achieve higher b-jet selection efficiency or suppress the fake contribution from
light jets, a kinematical requirement on the non-isolated muon is sometimes imposed.
Suppose we know the direction of the jet axis. This axis is a good approximation of
the initial b-quark momentum vector, or the flight direction of the b-hadron, which is
produced by the hadronisation of the initial b-quark. In the decay of the b-hadron, the
lepton momentum transverse to the b-hadron flight direction, or approximately the
jet axis, can be as large as half of the b-hadron mass. On the other hand, there is no
mechanism for hadrons produced from light quarks to acquire momentum transverse to
the jet axis other than the tiny contribution from the hadronisation process. Thus, the
transverse momentum relative to the jet axis gives some discriminating power
between b-jets and other types of jets.
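The transverse momentum relative to the jet axis is a simple vector operation. A minimal sketch, assuming Cartesian momentum components in GeV and a hypothetical function name:

```python
import math

def pt_rel(lepton_p, jet_axis):
    """Lepton momentum component perpendicular to the jet axis.

    Computed as |p_lepton x p_jet| / |p_jet|; both arguments are (px, py, pz).
    """
    lx, ly, lz = lepton_p
    jx, jy, jz = jet_axis
    jet_norm = math.sqrt(jx * jx + jy * jy + jz * jz)
    cx = ly * jz - lz * jy
    cy = lz * jx - lx * jz
    cz = lx * jy - ly * jx
    return math.sqrt(cx * cx + cy * cy + cz * cz) / jet_norm

# A muon with 4 GeV of momentum perpendicular to a jet along the x axis.
print(pt_rel((3.0, 4.0, 0.0), (1.0, 0.0, 0.0)))
```

Only the direction of the jet axis matters, so the jet momentum does not need to be normalised beforehand.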


6.6.2 τ-Jet


The τ identification is classified into a few categories based on whether the τ decays
leptonically or hadronically, and on the number of final-state particles. The branching
fraction of leptonic decays is about 35%. In the remaining 65% of cases, the τ decays
hadronically, with one charged particle (one prong) with a fraction of about 50%,
and with three charged particles (three prong) with a fraction of about 15%.

In the case of leptonic decays, the τ identification is actually the identification of an
isolated electron or muon, as the final state consists of either an electron or a muon and
neutrinos, which are not detected. In the absence of other neutrinos in the event, the sum of
the momentum vector of the isolated electron or muon and the missing ET can be treated as
the momentum of the τ. This is a rather straightforward and cleaner method compared to the
identification in hadronic decays.

The identification for hadronic decays is more complex, but important because of
the larger branching fraction. In the hadronic decay, there are one or three charged
particles, often associated with extra neutral particles such as π⁰. Since we are now
dealing with rather high momentum τ’s, the decay products are boosted and
collimated, resulting in a τ-jet. The particles inside the τ-jet are basically decay products
of the τ, hence the number of particles, which does not depend on the τ momentum,
is typically smaller than that in a quark or gluon induced jet (hadronic jet), where
the number of particles depends on the momentum. As a result, the
particles, or the energy carried by the particles, inside the τ-jet are more collimated
than those in hadronic jets of the same jet energy. In addition, the hadronisation
process can generate a particle whose momentum relative to the jet axis is greater
than half of the τ mass due to QCD radiation. On the other hand, the maximum in
τ decay is half of the τ mass. This is another reason why the τ-jet is more collimated.
In terms of the width of the shower shape, the other important point we have to take care of
is the electromagnetic shower, which mimics a τ shower. As we have seen in the previous
sections and chapters, the size or width of an electromagnetic shower is smaller than
that of a hadronic shower. This means that an electron could fake a τ if you
just select collimated jets. Therefore, we have to require the jet width to be narrower
than a hadronic jet and wider than an electromagnetic shower at the same time.

Another feature of the τ-jet is that the τ has a finite lifetime, with cτ = 87 μm, causing
a possible decay vertex in addition to the primary vertex created by the collision. This
means that track-based b-tagging can in principle also discriminate τ-jets from
hadronic jets. However, this lifetime is much shorter than that of
b-hadrons. Therefore, using a method similar to track-based b-tagging alone does not
produce sufficient discriminating power. Still, the track information helps to identify
the τ in combination with the jet shower shape variables.

The actual τ-jet identification starts from reconstructing a jet. Usually, no special
jet clustering algorithm for τ-jets is used. The same kind of algorithm as for hadronic jets
is used, with the parameters possibly tuned for τ-jet clustering. The jet considered here is
a cluster based on energy deposits in the calorimeter. The next step is to select and
associate tracks to the τ candidate jet. Some quality cuts and a requirement on
pT are the standard criteria. If necessary, the τ-jet candidate is categorised as one
or three prong based on the number of associated tracks.

Here, we show some variables that are actually used in the ATLAS τ-jet
identification. Figure 6.30 shows the fraction of calorimeter energy in the region
ΔR < 0.1 to the total energy in the jet for Z → ττ or W → τν MC and for real
data where most of the jets originate from light quarks or gluons. Figure 6.31 shows
the maximum ΔR between the tracks inside a jet and the τ-jet axis. As can be seen
from these two figures, the energy flow of the τ-jet is concentrated at the centre of
the jet.
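A variable of the first kind, the fraction of jet energy in a narrow core, can be sketched as follows. The representation of the calorimeter as a list of (eta, phi, energy) cells and the function names are assumptions for illustration. The actual ATLAS variable definitions may differ in detail, for example by using transverse energy.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance with the azimuthal difference wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def core_energy_fraction(jet_eta, jet_phi, cells, core_dr=0.1):
    """Energy within core_dr of the jet axis divided by the total jet energy.

    'cells' is a list of (eta, phi, energy) entries belonging to the jet.
    """
    total = sum(energy for _, _, energy in cells)
    if total <= 0.0:
        return 0.0
    core = sum(energy for eta, phi, energy in cells
               if delta_r(eta, phi, jet_eta, jet_phi) < core_dr)
    return core / total

narrow_jet = [(0.00, 0.0, 8.0), (0.05, 0.0, 1.0), (0.30, 0.0, 1.0)]
wide_jet = [(0.00, 0.0, 3.0), (0.20, 0.1, 4.0), (0.30, -0.2, 3.0)]
print(core_energy_fraction(0.0, 0.0, narrow_jet))
print(core_energy_fraction(0.0, 0.0, wide_jet))
```

A τ-like jet gives a value close to one, while a broad hadronic jet gives a smaller value, which is the separation visible in distributions of this type.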
The τ identification algorithm nowadays exploits multivariate analysis such as a
likelihood, a neural network or a boosted decision tree based on the variables discussed
above or some variations of them.
Event Simulation


Not only real experimental data but also simulated data are necessary for modern data
analyses because experiments, that is, detectors, etc., are becoming so complex that we
need the help of Monte Carlo (MC) simulation to understand the experimental data. A
Monte Carlo method is a technique to simulate high-energy physics interactions and
detector responses by applying random sampling to probability distributions that
model the experiment. In collider experiments, MC simulated events (MC events or MC
samples in short) are produced event by event; in the case of proton-proton colliders,
each event corresponds to a bunch crossing, where several pp collisions (pile-up)
may occur. MC simulation is also useful to design new experiments and detectors.


7.1 Overview 


We outline the production of MC events with MC simulation, where three steps are 
considered: event generation, detector simulation and reconstruction as shown in 
Fig. 7.1. 

In the event generation step, we produce events, for example, two photons from
Higgs bosons in proton-proton collisions (pp → gg → H → γγ), or two quarks
from Z bosons in electron-positron collisions (e⁺e⁻ → Z → qq̄). Unstable
particles, whose lifetime is too short for them to reach the detectors, are decayed according to
branching fractions which are obtained from experimental measurements or
theoretical predictions. The output of this step is a list of particles with various information,
for instance, energy, momentum, production and decay positions, status and relations
between particles, i.e., parent and children.

Fig. 7.1 The flow of the production of MC events (full detector simulation) and real data for data
analysis. The Monte Carlo method is used in the event generation and detector simulation steps but
not in the reconstruction step. The reconstruction step should work for both MC events (solid) and
real data (dashed)

In the detector simulation step, we simulate our detector responses to the stable
particles produced in the previous step. For example, electrons lose
energy by interacting with detectors: they produce electron-hole pairs in pixel and silicon
detectors, ionise the gas in gas detectors and produce EM showers in calorimeters.
These energy deposits are converted to charges if necessary. The format of the output is
the same as that of the real data from detectors because the next step should be applied
to both real data and MC simulation. The detector simulation should ideally be as
precise as possible, but its precision depends on the requirements of the physics goals and
on technical limitations, for example, modelling of detectors, computing resources,
etc. The pile-up effect in pp collisions can be taken into account after the detector
simulation; for example, we prepare several events of inelastic interactions and
mix them according to the number of collisions per bunch crossing.
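The pile-up mixing can be sketched as below. The sketch assumes that the number of additional collisions per bunch crossing follows a Poisson distribution with a given mean and that each simulated event is simply a list of detector hits. Both are simplifications, and the function names are hypothetical.

```python
import math
import random

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the moderate means used here."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def overlay_pileup(hard_scatter_hits, minbias_pool, mean_interactions, rng):
    """Add the hits of randomly chosen inelastic events to the hard-scatter hits."""
    n_pileup = sample_poisson(mean_interactions, rng)
    mixed = list(hard_scatter_hits)
    for _ in range(n_pileup):
        mixed.extend(rng.choice(minbias_pool))
    return mixed, n_pileup

rng = random.Random(2024)
minbias_pool = [["mb0_hit0"], ["mb1_hit0", "mb1_hit1"], ["mb2_hit0"]]
hits, n_pileup = overlay_pileup(["signal_hit"], minbias_pool, 3.0, rng)
print(n_pileup, len(hits))
```

Because the inelastic events are simulated once and reused, the expensive detector simulation does not have to be repeated for every pile-up condition.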

In the reconstruction step, we reconstruct events from the output of the detector
simulation and identify particles. Reconstructed objects in each event are still
candidates for particles. For instance, electron candidates are objects reconstructed
from EM calorimeter clusters matched with charged tracks. They come from
electrons (called true electrons) or from other particles (fake electrons: π±, τ, etc.); note
that the fraction of true electrons is not so high. Then, identification programs are applied
in order to select as many true electrons as possible from the electron candidates. In
other words, fake electrons are rejected as much as possible. As a result, the
fraction of true electrons in the selected electron candidates becomes high. Momentum
and energy are also calculated for each object, including calibrations if possible. The
output of this step is used in the data analysis.


7.2 Event Generation 


An MC event is produced in several steps, where each step uses different
programs. We explain the outline of how to produce an event using a concrete example,
the tt̄ process in pp colliders, with several keywords often used for MC production:
matrix-element event generator, parton density function, parton shower,
fragmentation, hadronisation, underlying event, etc. The details of the theoretical aspects can be
found in books, for example, [1]. Then, we list concrete computing programs used
in the ATLAS experiment.
Fig. 7.2 Feynman diagrams of the tt̄ process in pp collisions: gg → tt̄, t → W⁺b, W⁺ →
ud̄, t̄ → W⁻b̄, W⁻ → ℓν̄ with s-channel (left) and t-channel (right)


7.2.1 Production of tt̄ Process


Let us consider the production of tt̄ events in pp collisions: tt̄, t → W⁺b, t̄ → W⁻b̄,
where one of the W bosons decays into quarks and the other into leptons. Figure 7.2 shows
two Feynman diagrams of this production: gg → tt̄, t → W⁺b, W⁺ → ud̄, t̄ →
W⁻b̄, W⁻ → ℓν̄. This is the so-called hard-process part. Matrix-element event
generators (ME generators) are used for the production of the hard-process part. ME
generators can produce events by considering all the diagrams which have the same
final state. ME generators also simulate the gluons (g) taken from the protons
in the proton-proton collision. The momentum fraction of gluons is described by
parton density functions (PDFs), which are obtained based on QCD and experimental
measurements. Gluon and quark PDFs depend on the energy of the interaction, and
we need to define an energy scale at which to evaluate the PDFs, for example, the top mass for
tt̄ production. This scale, called the factorisation scale μ_F, is one of the important
parameters in the MC production.

However, the production of a hard-process using ME generators is not the end
of the story but the starting point of the event generation. To produce MC events,
several different steps (parton shower, fragmentation/hadronisation, etc.) have to be
performed as shown in Fig. 7.3. Lots of quarks and gluons with relatively small pT
are emitted using the parton shower method, which describes soft and collinear emissions. A
Sudakov form factor, which is the probability not to emit a parton until a target energy
scale is reached, is calculated to perform the MC method. There are two different types of
showers for such emissions: initial state radiation (ISR) and final state radiation (FSR).
The theoretical idea behind them is similar; however, ISR is more complicated due to the
so-called backward evolution, which is “tracing the showers backwards in time” (see,
for example, [2]). As shown in Fig. 7.3, the use of ISR and FSR depends on when the
emissions happen; ISR (FSR) is for radiation before (after) the hard-process. ISR
gluons and quarks are radiated from the initial gluons of gg → tt̄, whereas FSR gluons
are radiated from the quarks (u and/or d̄) of W⁺ → ud̄.
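How a no-emission probability drives the MC method can be illustrated with a toy shower. The sketch assumes a constant emission density per unit logarithm of the scale, controlled by a made-up `strength` parameter, so that the probability of no emission between a starting scale and a lower scale t is (t / t_start) ** strength. Real parton showers use splitting functions and a running strong coupling instead, and the function names here are hypothetical.

```python
import random

def next_emission_scale(t_start, strength, rng):
    """Sample the scale of the next emission below t_start.

    Solves no_emission_probability(t_start, t) = r for t, where r is uniform
    in (0, 1] and the toy no-emission probability is (t / t_start) ** strength.
    """
    r = 1.0 - rng.random()   # uniform in (0, 1], avoids r == 0
    return t_start * r ** (1.0 / strength)

def toy_shower(t_hard, t_cutoff, strength, rng):
    """Generate an ordered list of emission scales between t_hard and t_cutoff."""
    emissions = []
    t = next_emission_scale(t_hard, strength, rng)
    while t > t_cutoff:
        emissions.append(t)
        t = next_emission_scale(t, strength, rng)
    return emissions

rng = random.Random(7)
print(toy_shower(t_hard=100.0, t_cutoff=1.0, strength=0.5, rng=rng))
```

With these toy settings the probability that the shower produces no emission at all is (1 / 100) ** 0.5 = 0.1, which can be checked by running many showers and counting the empty ones.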

Fig. 7.3 Details of MC event production of tt̄ from proton-proton collisions: parton
showers (ISR, FSR), fragmentation, hadronisation, photon radiation and underlying event. The
hard-process shown with a dashed box is the same as the left plot of Fig. 7.2. Redrawn from Fig. 1 of [3]

Quarks and gluons cannot be observed directly due to colour confinement (see
Sect. 6.4), so they need to be combined to compose hadrons. This procedure
is called hadronisation. For example, to produce a B meson (B⁰(b̄d), B⁺(b̄u)), u
or d quarks are produced from the parton shower/fragmentation and one of them is
combined with a b quark. New additional quarks and gluons are created from the
existing quarks, gluons or the vacuum, which is the gluon field in this case. This step is
called fragmentation but is sometimes included in the step of hadronisation. Photons
can be radiated from charged leptons and quarks, which is called QED radiative
corrections. The remaining parts of the protons not used in the hard-process are called the
“underlying event” and must be treated properly with parton showers,
hadronisation and fragmentation. Finally, the decays of mesons, baryons and leptons are
performed until the particles produced in the decays are “stable”. The kinematics (energy
and momentum) of all the particles are determined under conservation rules (energy,
momentum, spin/polarisation, etc.) with the MC method.

The Feynman diagrams shown in Fig. 7.2 are the leading order (LO) of the gg → tt̄
process, and there are no additional gluons or quarks. However, we can use ME
generators to produce more high-pT gluons and quarks. For example, when we
consider one additional strong coupling, an additional gluon or quark can be emitted,
which should be treated by the ME generators instead of the parton shower. This is a
part of the next-to-leading order (NLO) contribution. Gluons or quarks produced
by ME generators and by the parton shower have to be treated properly to avoid double
counting, which is briefly explained in the next section.
7.2.2 Event Generators


Many event generators are available on the market. Several main generators used in
ATLAS for Run 1 and Run 2 data analysis are listed in alphabetical order: ALPGEN [4],
HERWIG [5-7], MADGRAPH (MADGRAPH5_AMC@NLO) [8], MC@NLO [9],
PHOTOS [10], POWHEG [11], PYTHIA [2,12], SHERPA [3], TAUOLA [13], etc. Since
different generators have different features, each generator has its own pros and cons.

HERWIG, PYTHIA and SHERPA are multi-purpose event generators. They can do all
the steps explained before, including parton showers, fragmentation, hadronisation
and decay. Not only LO but also NLO calculations for ME are available for some
processes in these generators. However, other ME generators like MADGRAPH,
POWHEG, etc. are often used for NLO in ATLAS. HERWIG and PYTHIA can be used
for simulating parton showers, fragmentation, hadronisation and decay for these ME
event generators. The theoretical idea of the parton shower differs among
generators: angular ordering for HERWIG and pT (or kT) ordering for PYTHIA. The models for
fragmentation and hadronisation are also different: the cluster model for HERWIG and
the string model for PYTHIA. Their difference is often used as a systematic uncertainty
from the parton shower and fragmentation/hadronisation models. HERWIG has three
major versions: HERWIG, HERWIG++ and HERWIG 7. PYTHIA has two major
versions: PYTHIA 6 and PYTHIA 8. HERWIG++, HERWIG 7 and PYTHIA 8 were used more often
in ATLAS in Run 2 than HERWIG and PYTHIA 6, since new features and
techniques of event generation were only implemented in HERWIG++, HERWIG 7
and PYTHIA 8.

ALPGEN, MADGRAPH and SHERPA are multi-leg generators. They can produce
events with ME including multi-partons. The multi-partons are the so-called additional
partons or additional jets, so we often call such physics processes X+jets, for
example, W+jets, Z+jets, etc. The idea behind such additional jets with ME (ME
jets) is that the modelling of jets produced with parton showers (PS) might not
work well because the parton shower is based on the soft and collinear approximation.
Figure 7.4 shows one of the diagrams for W+2-jets; the additional quark and
gluon associated with the W boson can be produced by either ME or PS. Calculations
based on ME are in principle correct (see also the next paragraph, however). When
the existence and behaviour of additional jets are critical in a data analysis, the use
of ME generators is recommended to describe the additional jets, in particular high-pT
jets, but we should be aware that the implementation of loop corrections, etc.
may depend on the generator. For example, in SUSY searches, typical SUSY
events from gluinos and squarks have several jets in the final state (e.g., g̃g̃ →
qq̄χ̃₁⁰ qq̄χ̃₁⁰), and one of the dominant background processes is Z → νν̄+jets.
The “+jets” part of the Z process must be modelled well to predict the background, so
SHERPA is used for Z plus up to 4-jets with ME in ATLAS [14].

Fig. 7.4 Feynman diagram of W+jets: 2 jets (a quark q and a gluon) are associated
with a W boson

There is one important thing to be considered, which is called jet-parton
matching. Even if multi-leg generators are used to produce additional jets with ME,
the parton shower has to be applied to produce additional jets, in particular low-pT
jets. Practically, we assume that ME takes care of high-pT jets and PS of
low-pT jets, because ME jets are well modelled in the high-pT region. To ensure this
assumption, reconstructed high-pT jets must match partons produced by ME. In
addition, jets in some regions of phase space are produced by both ME and PS, which leads
to double counting of events. To address this, the MLM prescription [15], the CKKW
matching procedure [16,17], etc. are applied, and events are discarded if they do not
satisfy their requirements. To explain the MLM prescription in ALPGEN, let us consider
the production of W+up to 2-jets with an ME generator, where jets with pT > 20 GeV
are used for the data analysis. First, events are produced for W with 0 ME-jets, W with 1
ME-jet and W with 2 ME-jets separately, where additional partons with, for
example, pT > 15 GeV are produced by the ME generator. Note that the parton shower
is also applied, so some high-pT jets might be produced by PS. Then, all the
reconstructed jets with pT > 15 GeV are checked to see whether they match an ME parton, and
we count such jets. For W with 0 (1) ME-jet, we require the number of such jets to be exactly
0 (1). For W with 2 ME-jets, we require the number of such jets to be 2 or more. Finally, we
merge the remaining events to make a sample of W+up to 2-jets events.
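The acceptance rule of this example can be written compactly. The sketch follows only the counting rule stated in the text (exclusive for the lower multiplicities, inclusive for the highest one) and assumes that the jet counting and parton matching have already been done. The event representation and function names are hypothetical, and this is not the ALPGEN code itself.

```python
def keep_event(n_me_partons, n_counted_jets, max_me_partons):
    """Decide whether an event survives the matching requirement.

    Samples below the highest ME multiplicity are exclusive (exact match);
    the highest-multiplicity sample is inclusive (that many jets or more).
    """
    if n_me_partons < max_me_partons:
        return n_counted_jets == n_me_partons
    return n_counted_jets >= n_me_partons

def merge_samples(samples, max_me_partons):
    """samples maps the ME parton multiplicity to a list of events.

    Each event is a dict with the key 'n_jets' holding the counted jets.
    """
    merged = []
    for n_me_partons, events in sorted(samples.items()):
        for event in events:
            if keep_event(n_me_partons, event["n_jets"], max_me_partons):
                merged.append(event)
    return merged

samples = {
    0: [{"id": "a", "n_jets": 0}, {"id": "b", "n_jets": 1}],   # b: extra jet from PS
    1: [{"id": "c", "n_jets": 1}, {"id": "d", "n_jets": 2}],   # d: extra jet from PS
    2: [{"id": "e", "n_jets": 2}, {"id": "f", "n_jets": 3}],   # both kept (inclusive)
}
print([event["id"] for event in merge_samples(samples, max_me_partons=2)])
```

Events b and d are discarded because the same final state is already covered by the sample with one more ME parton, which is how the double counting is removed.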

MADGRAPH, MC@NLO, POWHEG and SHERPA are used as NLO generators for
some specific processes. One additional parton and loop diagrams up to
next-to-leading order are properly taken into account.

PHOTOS generates QED radiative corrections for charged leptons and quarks.
TAUOLA is a program to simulate tau decays including polarisation properly. They
are optionally used in PYTHIA and HERWIG.¹


7.2.2.1 Cross Section 

MC events are produced by using event generators, which can provide their cross
sections including branching fractions. However, in many cases, we do not use the cross
sections provided by event generators but values obtained from dedicated programs,
because such programs can perform higher-order calculations than event
generators. NLO event generators are available on the market for the most important physics
processes, but NNLO event generators are limited. On the other hand, the dedicated
programs can calculate cross sections up to NNLO (or higher for some processes).

¹ The recent version of PYTHIA 8, for example, 8.2, can treat tau-decay polarisation properly without
TAUOLA. We need to check the updates before using any generator.

Table 7.1 Cross section measurements at the ATLAS experiment (√s = 13 TeV) with
theoretical predictions. The measurements are given with statistical and systematic uncertainties. The
predictions are given with combined uncertainties including PDF, αs, scales, etc. For the Higgs
production cross section, five processes are included: gluon fusion (NNNLO), VBF (approximate
NNLO), VH (NNLO/NLO), tt̄H + tH (NLO) and bb̄H (NNLO/NLO)

Process | Measurement | Prediction (higher order for QCD) | References
W → ℓν (ℓ = e, μ) | 20.64 ± 0.02 ± 0.70 nb | 20.08+9:65 nb (NNLO) | [18]
Z → ℓℓ (ℓ = e, μ) | 1969 156 pb | 1886725 pb (NNLO) | [19]
tt̄ | 826.4 ± 3.6 ± 19.6 pb | 832132 pb (NNLO+NNLL) | [20]
Higgs | 55.4 + 3.1532 pb | 55.6 ± 2.5 pb (NNNLO, etc.) | [21]
ZZ | 17.3 ± 0.6 ± 0.8 pb | 16.9+0$ pb (NNLO) | [22]
tt̄W | 0.87 ± 0.13 ± 0.14 pb | 0.60 ± 0.07 pb (NLO) | [23]

There is an important parameter, the renormalisation scale μ_R, which is the
scale at which the strong coupling is evaluated in order to calculate cross sections.
The true value of a cross section does not depend on the choice of μ_R; however, since
we cannot perform a complete calculation including all orders of the strong
coupling, the calculated cross sections may depend on μ_R. In the high-energy
region (> O(100 MeV)), a perturbative method works well in QCD as in QED, and
we can calculate cross sections, for instance, up to next-to-next-to-next-to-leading
order (NNNLO, N3LO or N³LO) for Higgs production via gluon fusion.

When the higher-order calculations are included, cross sections can be properly predicted
and are consistent with the experimental results. Table 7.1 shows the results
of W, Z, tt̄, Higgs, ZZ and tt̄W production cross section measurements as concrete
examples with theoretical predictions.
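In practice, a cross section from a dedicated calculation typically enters an analysis through the normalisation of an MC sample: each generated event is given a weight so that the sample corresponds to the integrated luminosity of the data. The sketch below shows this standard bookkeeping. The luminosity, the number of generated events and the function name are hypothetical, and only the central value of 832 pb for tt̄ is taken from Table 7.1.

```python
def mc_event_weight(cross_section_pb, luminosity_ifb, n_generated,
                    filter_efficiency=1.0):
    """Weight per MC event so that the sample matches the data luminosity.

    expected events = cross section x filter efficiency x integrated luminosity,
    with 1 fb^-1 = 1000 pb^-1.
    """
    expected = cross_section_pb * filter_efficiency * luminosity_ifb * 1000.0
    return expected / n_generated

# tt-bar sample normalised to the predicted 832 pb, for a hypothetical
# 10 fb^-1 of data and one million generated events.
weight = mc_event_weight(832.0, 10.0, 1_000_000)
print(weight)
```

If the generator reports a lower-order cross section, the same function can be used with the higher-order value, which is equivalent to applying a correction factor to the generator normalisation.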


7.3 Detector Simulation 


Particles produced in the event generators are detected through their interactions with
several detector components (materials). To reach the detector volume including the
beam pipe, particles must have a long enough lifetime, that is, they are “stable particles”
from the viewpoint of the detector simulation. Such stable particles are electrons (e±),
photons, π±, kaons (K±, K⁰_S, K⁰_L), μ±, protons (p/p̄), neutrons and neutrinos in
the case of the SM. In addition, in SUSY and other new physics models beyond the
SM, some particles are stable; for example, the lightest neutralino is stable in SUSY models
with R-parity conservation. Some of the stable particles, for example, π±, K⁰_S, K±
and μ±, can decay according to their lifetime in the detector volume, which is done
in the detector simulation step, not by the event generator.
The GEANT4 program [24] is a detector simulation toolkit widely used in
experimental particle physics. We build each detector component and define its interactions
based on the real detector in order to emulate how particles interact and how much
of their energy is lost: energy loss and multiple scattering (charged tracks in
tracking volumes); electromagnetic interaction/shower; hadronic interaction/shower; etc.
This is based on the best knowledge of particle interactions with materials. GEANT4
traces particles step by step and simulates their interactions. This is the reason why MC
simulation with GEANT4 is called full simulation.

There are other types of MC simulation: “fast simulation” and “parametric
simulation”. In the parametric simulation, the detector response is described by
expected resolution functions for each stable particle. Momentum and energy are
smeared with the resolution functions. The effects of the reconstruction and
identification programs, that is, their efficiencies, are replaced with weights or MC methods
following their expected performance. “Fast simulation” is sometimes the same as
the parametric simulation, but this term is also used in the case that a part of the
detector simulation step, for example, the calorimeter response, is replaced with a faster
algorithm to emulate the detector response. In terms of the modelling of the real data, in
general, the full simulation is better than the fast and parametric simulations.
However, from the point of view of execution time, the full simulation is much slower; for
example, in some extreme cases, it takes several minutes per event with the full simulation
but less than a few seconds with the parametric simulation. If we need lots of events
to reduce the MC statistical uncertainties, the use of fast or parametric simulations
is one of the options. In addition, it takes a much longer time to develop computing
programs with GEANT4.
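A parametric simulation of a single particle can be sketched as follows. The resolution function (a stochastic term and a constant term added in quadrature) and its coefficients, the Gaussian smearing, and the flat efficiency are all assumptions made for illustration and do not describe any specific detector. The function names are hypothetical.

```python
import math
import random

def energy_resolution(energy, stochastic=0.10, constant=0.01):
    """Absolute resolution sigma_E for sigma_E / E = a / sqrt(E) (+) c.

    The coefficients are hypothetical; energy is in GeV.
    """
    return energy * math.sqrt(stochastic ** 2 / energy + constant ** 2)

def simulate_particle(true_energy, efficiency, rng):
    """Return a smeared energy, or None if the particle is not reconstructed.

    The efficiency is applied with the MC method: a random number decides
    whether the particle is kept.
    """
    if rng.random() > efficiency:
        return None
    return rng.gauss(true_energy, energy_resolution(true_energy))

rng = random.Random(42)
measured = [simulate_particle(50.0, 0.8, rng) for _ in range(5)]
print(measured)
```

Instead of discarding particles at random, the efficiency could equally be applied as an event weight, which keeps all generated events and reduces the MC statistical uncertainty.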
Examples of Physics Analysis


In this chapter, we present analyses of Higgs and new physics searches as
examples of data analysis. The data handled here are already calibrated and the particle
identification for each object is also done.¹ In the so-called “data analysis” of
collider experiments, the event selection, background estimation, and signal extraction
or measurement, including the evaluation of systematic uncertainties, are performed.


8.1 Higgs 
8.1.1 Higgs Production Mechanism in Hadron Colliders 


There are several different processes of Higgs production. Figure 8.1 shows the Higgs
production cross sections in pp collisions as a function of the Higgs mass. The largest
contribution comes from gluon fusion (Fig. 8.2a), in which there is no additional
topology or feature other than the Higgs production. Hence, an inclusive analysis
(see Sect. 2.3) is required as long as we consider the gluon fusion process. On the
other hand, the final states of the other three processes contain not only the Higgs but
also extra particles, resulting in characteristic topologies.

The second-largest cross section is via the vector boson fusion (VBF) process, where
either Ws or Zs radiated from quarks couple together, producing a Higgs boson
(Fig. 8.2b). The quarks radiating W or Z bosons appear as forward jets, because their
pT, which tends to be close to the W or Z mass, is much smaller than the momentum
of the colliding protons. In addition, since this process does not contain any colour
exchange between the incoming quarks, no parton radiation would exist around
the produced Higgs or in the central region of the detector, in contrast to the overwhelming
multijet background, where not only hard jets but also many soft jets are produced.
Putting what is mentioned so far together, the Higgs production through the vector
boson fusion process has a very unique topology with two forward jets and with
little QCD activity (partons due to colour exchange) in the central region except
for the Higgs decay products. This feature allows us to significantly reduce the background due
to multijet production as well as other types of background.

Fig. 8.1 Higgs production cross section as a function of Higgs mass at √s = 8 TeV. Reprinted
under the Creative Commons Attribution 3.0 License from [1] © 2013-2022 CERN

Fig. 8.2 Feynman diagrams of Higgs production

¹ An object may be compatible with more than one type of particle, for example, an electron or a tau.
The final particle identification, that is, the assignment of a particle type to each object, depends on
the data analysis.

Another important production mechanism is the associated production with a vector
boson, i.e., either W or Z (Fig. 8.2c). In case the W or Z decays hadronically, it
does not help to improve the signal-to-noise ratio due to the overwhelming multijet
backgrounds. However, leptonic decays of W or Z produce isolated leptons, allowing
us to significantly improve the signal-to-noise ratio at the cost of the small branching
fractions of W and Z.

The cross section of the associated production with tt̄ (Fig. 8.2d) is one order
of magnitude smaller than that of WH production. It is still accessible because of the
characteristic topology. This production mechanism has special importance because
it allows direct access to the top Yukawa coupling.

Below we describe the basic idea of the analysis for H → γγ, H → bb̄, and
H → W⁺W⁻.


8.1.2 H → γγ


The Higgs boson was discovered by the ATLAS [2] and CMS [3] experiments in
2012. In this discovery, the H → γγ and H → ZZ* → ℓℓℓ'ℓ' channels played the
most important role because the invariant mass of the Higgs boson can be reconstructed
precisely in them compared to other channels, for example, H → WW* → ℓνℓ'ν',
even though the expected statistics for H → γγ and H → ZZ* → ℓℓℓ'ℓ' are not high. In
the distribution of the invariant mass of the Higgs boson candidates, we can observe
a clear peak of the signal on top of the background events, which is one of the most
reliable pieces of evidence of a resonance particle to claim its discovery. In this section, we
explain how to search for the Higgs boson with the H → γγ channel in the ATLAS
experiment.

As mentioned before, the signal statistics are limited since the branching ratio of 
H → γγ is very small, about 0.2%, for a mass of around 125 GeV. Thanks to the 
good resolution of the diphoton invariant mass m_γγ, however, a narrow resonance 
was expected to be observed on a huge but smooth background, as shown in 
Fig. 8.3 [4]. Below we explain how this result is obtained. 

We need two photons to reconstruct the diphoton invariant mass, which is the 
final discriminant used to extract the signal. Events with two photon candidates 
must be recorded to offline storage to perform the analysis, and diphoton triggers 
(with photon E_T thresholds of 35 and 25 GeV) were used for the trigger selection. 
Since events with jets faking photons, which are called fake photons, are not 
negligible, we cannot use, for example, single-photon triggers with a low E_T 
threshold like 25 GeV.² In the analysis, two photon candidates were selected with 
p_T > 40 GeV and 30 GeV, which are high enough to ensure that the offline-selected 
events have 100% trigger efficiency. This is a common technique in physics 
analysis, because estimating the trigger efficiency is not easy in general, especially 
for momenta close to the efficiency turn-on. We can avoid using events near the 
trigger turn-on by requiring a much higher p_T in the offline selection than at the 
trigger level. In this way, a possible source of large systematic uncertainty is 
removed at the cost of losing some fraction of the signal events. 
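
The reason for placing the offline cut well above the trigger threshold can be illustrated with a toy turn-on curve. The sketch below assumes an error-function shape and a 1.5 GeV smearing between online and offline quantities; both are illustrative assumptions, not the measured ATLAS trigger response.

```python
import math


def turn_on_efficiency(pt, threshold, resolution):
    """Toy trigger efficiency: an error-function turn-on centred at `threshold`."""
    return 0.5 * (1.0 + math.erf((pt - threshold) / (math.sqrt(2.0) * resolution)))


# Hypothetical values, for illustration only.
TRIGGER_THRESHOLD = 35.0  # GeV, online threshold of the leading photon
RESOLUTION = 1.5          # GeV, assumed online/offline smearing

for offline_cut in (35.0, 37.0, 40.0):
    eff = turn_on_efficiency(offline_cut, TRIGGER_THRESHOLD, RESOLUTION)
    print(f"offline cut {offline_cut:.0f} GeV: efficiency at the cut = {eff:.4f}")
```

In this toy model an event sitting exactly at the online threshold is accepted only half of the time, whereas events above a 40 GeV offline cut are on the plateau, so the poorly known shape of the turn-on no longer matters.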

There are three different processes in the background events: two real photons, one 
real photon plus one fake photon, and two fake photons, which are called γγ, γ+jet, 
and dijet, respectively. These background events do not make a peak but a smoothly 
falling curve in the diphoton invariant mass (m_γγ) distribution, as shown in Fig. 8.3. These 


2 A threshold of 120 GeV or higher is required to use single-photon triggers, which is too high for 
the photons coming from a Higgs boson of 125 GeV mass. 

Fig. 8.3 Invariant mass distribution of diphotons for the combined 7 and 8 TeV data in ATLAS. 
Reprinted under the Creative Commons Attribution License 3.0 from [4] Copyright © 2013 CERN. 
The result of a fit to the data with the sum of a SM Higgs boson (126.8 GeV) and background is 
superimposed. The lower panel shows the residuals of the data with respect to the fitted background 


compositions can be measured using photon identification variables, for example, 
an isolation variable. Their fractions were determined to be ~74% for γγ, ~22% 
for γ+jet, and ~3% for dijet. In addition, the Drell-Yan process (Z(*)/γ* → e⁺e⁻, 
DY) remains as ~1% of the background because of hard bremsstrahlung. 

It is important to improve the resolution of the m_γγ distribution. For this 
purpose, we need to measure the photon energies and the angle between the two 
photons as precisely as possible. Since the EM calorimeter in ATLAS has three 
longitudinal layers, the direction of each photon can be determined from the 
measured positions of its cluster in the layers. The production vertex of the 
diphoton is calculated from the directions of the two photons. This method is 
called calo-pointing. The position obtained with calo-pointing is precise enough 
in terms of the m_γγ resolution, while a more precise determination is required for 
the association of charged tracks to jets, because jets from pile-up are identified 
using this association. The production vertex position is finally obtained by 
combining several pieces of information, for example, charged tracks not matched 
to any photon, charged tracks from conversions, and the balance between the two 
photons and the charged tracks. The resolution of m_γγ is about 3%; events with 
two unconverted photons have a resolution about 10% better, in relative terms, 
than those with at least one converted photon. 
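
How the photon energies and directions enter m_γγ can be seen from the standard two-body formula for massless particles, m² = 2 p_T1 p_T2 (cosh Δη − cos Δφ). The following is a minimal sketch; the function name and the example kinematics are hypothetical.

```python
import math


def diphoton_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two massless particles given (pT, eta, phi) of each."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors


# Hypothetical photon pair (pT in GeV).
mass = diphoton_mass(62.0, 0.4, 0.1, 55.0, -0.7, 2.9)
print(f"m_gamma_gamma = {mass:.1f} GeV")
```

Because the opening angle enters through Δη and Δφ, an error on the production vertex position shifts η of both photons and therefore degrades the mass resolution, which is why the vertex determination matters.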

Selected events are classified into several categories for two reasons. The first 
is to improve the sensitivity of the search itself, which is called a global search 
here; the second is to measure the properties of specific production processes, for 
example, the VBF and VH processes, using extra leptons, jets, and missing E_T. For 
example, 14 categories were introduced in the 8 TeV data analysis using jets, leptons, 
and missing transverse energy: 2 for VBF, 3 for VH, and the other 9 for improving 
the discovery sensitivity. The categorisation improved the global search sensitivity 
by about 30% compared to the result without it. 

The event excess, which is a signature of Higgs decays, is evaluated with a local 
p0, the probability that the background-only hypothesis produces a distribution at 
least as signal-like as the observed one. If the p0 value is 0.5, the observation is 
consistent with the background-only hypothesis, that is, no excess. If the p0 value is 
smaller (larger) than 0.5, there is an excess (a deficit)³ over the background. 
In addition, if a search is performed for a new narrow resonance (~4 MeV in the case 
of the SM Higgs boson) with an unknown mass in the invariant mass distribution 
(m_γγ = [110, 160] GeV in the case of the SM Higgs boson), we need to take into 
account the so-called look-elsewhere effect. This accounts for the fact that excesses 
of, say, 3σ due to statistical fluctuations can happen somewhere in the search region 
even if there is no new resonance, and that such fake excesses become more frequent 
in narrow resonance searches.⁴ The p0 value after taking this effect into account is 
called a global p0.⁵ This effect is negligible in broad resonance searches, where the 
peak is wide because of an intrinsic particle width, worse detector resolution, etc. 
With the full dataset of LHC Run 1 in ATLAS (2011-2012), the largest excess with 
respect to the background-only hypothesis (based on local p0) was observed 
(expected) with 7.4 (4.3)σ at 126.5 GeV, as shown in Fig. 8.4 [4]. 
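
The relation between p0 and a significance in units of σ quoted in this section is the one-sided Gaussian tail probability. A minimal sketch using only the Python standard library follows. The `global_p_value` helper uses a crude trials-factor approximation with a hypothetical number of independent mass windows; it is only meant to illustrate the look-elsewhere effect and is not the procedure used in Ref. [4].

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, width 1


def significance(p0):
    """One-sided Gaussian significance Z (in sigma) for a p-value p0."""
    return _STANDARD_NORMAL.inv_cdf(1.0 - p0)


def local_p_value(z):
    """Inverse of `significance`: one-sided tail probability above z sigma."""
    return 1.0 - _STANDARD_NORMAL.cdf(z)


def global_p_value(p_local, n_trials):
    """Crude trials-factor estimate (assumption): probability that at least one
    of n_trials independent mass windows fluctuates as much as p_local."""
    return 1.0 - (1.0 - p_local) ** n_trials


print(f"p0 = 0.5     -> Z = {significance(0.5):.2f} sigma")
print(f"p0 = 0.00135 -> Z = {significance(0.00135):.2f} sigma")

# Hypothetical: a local 3 sigma excess with 10 independent windows in the search range.
p_global = global_p_value(local_p_value(3.0), n_trials=10)
print(f"global p0 = {p_global:.4f}, global Z = {significance(p_global):.2f} sigma")
```

The narrower the resonance relative to the search range, the larger the effective number of independent windows, and the more a given local significance is diluted.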


3 For the Higgs search in ATLAS, p0 = 0.00135 (2.87 × 10⁻⁷) corresponds to 3(5)σ, which is 
based on a one-sided limit. 

4 For example, suppose that resonances with a width of either 4 or 40 GeV could exist in the 
mass range of 110-150 GeV. In this case, we may see more statistical fluctuations that mimic the 
4 GeV signal than the 40 GeV one, because the overall shape of the 40 GeV signal hardly changes 
within the search range. 

5 In Ref. [2], the global significance of a local 5.9σ excess is estimated to be about 5.1σ in the mass 
range of 110-600 GeV. This result includes H → γγ, H → ZZ* → ℓℓℓ′ℓ′, and 
H → WW* → ℓνℓ′ν′. 


8.1.3 H → bb̄ 


This section outlines the analysis of H → bb̄. As the branching fraction of H → bb̄ 
is the largest (~58%) among the decays of a Higgs boson of 125 GeV mass, H → bb̄ 
could be the most useful and natural decay mode for searching for the Higgs boson 
and studying its properties in terms of statistics. On the other hand, the final-state 
signature consists of just two b-jets. This would pose no problem at e⁺e⁻ colliders 
such as the ILC, which provide a very clean experimental environment and hence a 
very high signal-to-noise ratio. At hadron colliders, however, the study of H → bb̄ 
is not straightforward at all because of the overwhelming QCD backgrounds (multijet 
background processes). At the LHC energy, for example, the production cross 
section of inclusive b-jets is about eight orders of magnitude larger than that of the 
Higgs boson. In addition, the identification of b-jets is not perfect, so light jets can 
mimic the signal. In this case, any jet production can be a background, and its cross 
section is even higher than the inclusive b-jet cross section. Therefore, at hadron 
colliders, we need some clever ideas to separate the H → bb̄ signal from the huge 
background. 

In the following, we discuss the analysis method of H → bb̄ using the vector 
boson fusion process first and then associated production with a W or Z. 


8.1.3.1 Vector Boson Fusion Process 

The final state consists of two b-jets from the Higgs decay and two forward jets. 
Since there are no isolated leptons or large missing E_T, which are commonly used 
to trigger an event, careful study and optimisation of the trigger are needed. The 
most apparent choice of trigger would be to require four jets with relatively high 
p_T. In addition, a requirement on the topology, i.e., the existence of two forward jets 
in different η regions, may be applied if such a topological trigger is available. 
Even with the requirements above, the remaining events would still be dominated 
by the multijet background because of its huge production cross section. To 
suppress the multijet events further, a muon (see Sect. 6.6.1.2) arising from the 
semi-leptonic decay of b-hadrons (directly or through the cascade decay to c, like 
b → cℓ⁻ν̄) may be required at the cost of statistics. Even though there are two 
b-hadrons (and hence, most of the time, two c-hadrons from their decays), the 
branching fraction of the semi-leptonic decay is only of the order of 10% (see 
Sect. 6.6.1). The p_T of the lepton from the semi-leptonic decay is also not very 
large. Because of these two factors, the signal efficiency is relatively low. Therefore, 
one has to optimise the trigger condition with careful study. In other words, this is 
where potential for improvement exists. 

The offline analysis starts by selecting events with four jets. Of the four, two 
are required to be in the central region (rather small |η|) and the other two in the 
forward region (= forward jets). The forward jets tend to keep the directions of their 
parent protons, and hence to lie in opposite η hemispheres. To select only the 
VBF process, the forward jets are commonly required to have a large separation in 
η, where if one is at η > 0 then the other must be at η < 0, and to have a large 
invariant mass reconstructed from the two forward jets. 
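
The forward-jet requirements above (opposite hemispheres, large η separation, large dijet mass) can be sketched as a simple selection function. The thresholds of 4.0 in Δη and 500 GeV in m_jj are hypothetical placeholders, not values taken from any analysis, and the jets are treated as massless.

```python
import math


def dijet_mass(jet1, jet2):
    """Invariant mass of two jets treated as massless; each jet is (pT, eta, phi)."""
    pt1, eta1, phi1 = jet1
    pt2, eta2, phi2 = jet2
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))


def passes_vbf_topology(jet1, jet2, min_delta_eta=4.0, min_mjj=500.0):
    """True if the two forward-jet candidates look like a VBF tagging pair."""
    opposite_hemispheres = jet1[1] * jet2[1] < 0.0
    delta_eta = abs(jet1[1] - jet2[1])
    return (opposite_hemispheres
            and delta_eta > min_delta_eta
            and dijet_mass(jet1, jet2) > min_mjj)


# Hypothetical jets: (pT [GeV], eta, phi).
forward_pair = ((60.0, 2.5, 0.3), (50.0, -2.8, 2.9))
print(passes_vbf_topology(*forward_pair), f"m_jj = {dijet_mass(*forward_pair):.0f} GeV")
```

Since m_jj grows roughly like exp(Δη/2) for fixed jet p_T, the Δη and m_jj requirements are strongly correlated; in practice they are tuned together.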

Once an event passes the selection criteria for the forward jets, the remaining 
part is rather straightforward. The two central jets must be identified as b-jets, where 
there is always room for optimising or tuning the b-tagging requirement. 
For example, requiring at least one b-tag is also possible. The tightness of the 
b-tagging requirement is another knob for tuning. Finally, we look for a signal peak 
in the dijet mass distribution reconstructed from the two central jets. 


8.1.3.2 Associated Production with W or Z 

The idea behind using associated production with W/Z is to exploit an isolated 
lepton from the W/Z decay to reduce background. In both the trigger and the offline 
event selection, an event is required to have at least one isolated lepton satisfying 
criteria on, for example, p_T or η. Then, in the offline selection, a W can be identified 
by reconstructing the transverse mass from the isolated lepton and the missing E_T. 
In the case of a Z, the dilepton mass is a powerful tool to separate the signal from 
backgrounds. 
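
The transverse mass used to tag a leptonically decaying W is the standard combination of the lepton p_T and the missing E_T, m_T = sqrt(2 p_T^ℓ E_T^miss (1 − cos Δφ)). A minimal sketch, with a hypothetical function name and example values:

```python
import math


def w_transverse_mass(lep_pt, lep_phi, met, met_phi):
    """Transverse mass of a lepton + missing-ET system (massless lepton)."""
    delta_phi = lep_phi - met_phi
    mt2 = 2.0 * lep_pt * met * (1.0 - math.cos(delta_phi))
    return math.sqrt(max(mt2, 0.0))


# Hypothetical W -> l nu candidate (GeV, radians).
print(f"m_T = {w_transverse_mass(38.0, 0.2, 35.0, 2.9):.1f} GeV")
```

For genuine W decays the m_T distribution has an edge near the W mass, whereas multijet events with a fake lepton and mismeasured missing E_T tend to populate low m_T.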

The procedure after selecting or tagging the W/Z is very similar to that in the VBF 
analysis. The dijet mass reconstructed from b-tagged jets is the most effective 
variable for discriminating signal from background. In the end, the dominant 
background is W/Z production associated with heavy-flavour jets, whose final 
state is exactly the same as the signal. On top of that, tt̄ production is also a main 
component of the remaining background. Therefore, the jet energy resolution needed 
to identify a possible peak from the H → bb̄ decay is one of the key elements of 
this analysis, as is the efficiency to detect and identify the final-state objects. 
Figure 8.5 shows the distribution of the dijet mass reconstructed from two b-tagged 
jets in the ATLAS experiment, where all the expected background contributions, 
except for VZ, Z → bb̄ (V = Z or W), are subtracted. One can see a peak from 
Z → bb̄ as well as a small enhancement around 125 GeV, which is the evidence 
for H → bb̄. 


Fig. 8.5 The invariant mass 

Fig. 8.6 The diagrams of a h → WW, b WW-pair production through a Z⁰, and c a top-quark 
pair, with both W bosons decaying leptonically into eμ 


8.1.4 H → WW* 


8.1.4.1 Analysis Overview 

The branching ratio of the decay channel h → WW* is about 22%, which is the 
second largest for m_h = 125 GeV. Since the mass of the Higgs boson is less than 
the sum of two W boson masses (about 161 GeV), one of the two W bosons is 
virtual, i.e. off-shell (h → WW*). The analysis using 8 TeV data from the ATLAS 
collaboration is described in detail in Ref. [6]. The corresponding 13 TeV analysis is 
given in Ref. [7], which describes the data analysis procedure only briefly and refers 
to the former paper [6]. In this section, some key points of the H → WW* analysis 
are described. 

The analysis uses the leptonic decay channel for both W bosons (h → 
WW* → ℓνℓν) (Fig. 8.6a), where ℓ is either an electron (e) or a muon (μ), in order 
to reduce the background from multijet production, pp → jets. Since the multijet 
final state can be produced by a process with only QCD vertices, which involve the 
strong coupling, the cross section of such production is many orders of magnitude 
larger than that of the h → WW* signal. The decay products of the Higgs boson are 
therefore one of the ee, eμ, or μμ combinations together with two or more neutrinos. 
The analysis also includes a smaller number of events containing W → τν_τ decays 
where the τ lepton further decays into an electron or muon with two additional 
neutrinos. Only the sum of the transverse momenta of the neutrinos (here denoted as 
p_T^νν) can be measured, through the missing p_T. 

Major sources of background are resonant-like WW production (Fig. 8.6b) 
and top-pair events where both top quarks decay leptonically, t → Wb, W → ℓν 
(Fig. 8.6c). The former process has the same final state as the signal and is an 
irreducible background if the WW pair is produced from a colourless state such 
as a virtual Z⁰ boson. The latter process gives two b-jets; it is the main background 
for the VBF production process, where we require two jets in the final state, and also 
a significant background for events with one jet in the final state. 

Fig. 8.7 a Relation between the spin direction and momentum direction for H → WW* → ℓνℓν 
decays. b Illustration of typical decay topology in the x-y plane (perpendicular to the beam direction) 

The reconstruction of the Higgs boson mass is not possible because of the 
neutrinos in the final state. Instead, the transverse mass, m_T, is calculated to estimate 
the invariant mass of the WW* system. It uses the transverse components of 
the kinematic variables: p_T^νν (p_T^ℓℓ), the vector sum of the transverse momenta of 
the neutrinos (leptons), and 

$$E_T^{\ell\ell} = \sqrt{\left|\vec{p}_T^{\,\ell\ell}\right|^2 + m_{\ell\ell}^2}.$$

The m_T is defined as 

$$m_T = \sqrt{\left(E_T^{\ell\ell} + \left|\vec{p}_T^{\,\nu\nu}\right|\right)^2 - \left|\vec{p}_T^{\,\ell\ell} + \vec{p}_T^{\,\nu\nu}\right|^2}.$$
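
The transverse mass of the dilepton plus missing-momentum system, m_T = sqrt((E_T^ℓℓ + |p_T^νν|)² − |p_T^ℓℓ + p_T^νν|²) with E_T^ℓℓ = sqrt(|p_T^ℓℓ|² + m_ℓℓ²), can be computed directly from the two leptons and the missing p_T vector. The sketch below treats the leptons as massless and takes p_T^νν to be the measured missing p_T; the function names and example values are hypothetical.

```python
import math


def _four_vector(pt, eta, phi):
    """(E, px, py, pz) of a massless particle."""
    return (pt * math.cosh(eta), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(eta))


def higgs_transverse_mass(lep1, lep2, met_x, met_y):
    """m_T of the dilepton + missing-pT system; each lepton is (pT, eta, phi)."""
    e, px, py, pz = (a + b for a, b in zip(_four_vector(*lep1), _four_vector(*lep2)))
    mll2 = max(e * e - px * px - py * py - pz * pz, 0.0)  # dilepton mass squared
    et_ll = math.sqrt(px * px + py * py + mll2)           # E_T of the dilepton system
    pt_nunu = math.hypot(met_x, met_y)                    # |missing pT|
    mt2 = (et_ll + pt_nunu) ** 2 - ((px + met_x) ** 2 + (py + met_y) ** 2)
    return math.sqrt(max(mt2, 0.0))


# Hypothetical event: two leptons close in azimuth, missing pT recoiling against them.
print(f"m_T = {higgs_transverse_mass((35.0, 0.3, 0.2), (20.0, -0.1, 0.9), -40.0, -25.0):.1f} GeV")
```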


Since m_T < m_h and m_h is below twice the W boson mass, h → WW* events 
populate an m_T region below that of resonant WW production (see Fig. 8.8). 
The m_T values for top-pair production also tend to lie well above those from 
Higgs decays. This shape difference is used to distinguish the signal and background 
quantitatively. The peak structure in m_T, however, is broad for both the signal and 
the WW background. Also, the production rate of WW* pairs is much smaller than 
that of SM diboson production. Several other features of the signal events are 
therefore used to reduce background processes. 

The number of jets, especially the number of b-jets, is one such key ingredient 
used to classify event categories. As described in Sect. 8.1.1, at leading order there is 
no jet in the ggF process, while in the VBF process each of the two incoming 
quarks emits a vector boson and recoils, giving two jets close to the outgoing beam 
direction, one on each side. This means that two forward jets are observed with a 
large separation in rapidity. Since these forward jets in the VBF process originate 
from light quarks, the background from tt̄ production is greatly suppressed by 
removing events with one or more b-quark jets. 

The azimuthal correlation of the two leptons is also used to further 
enhance the signal. Since the Higgs boson is a scalar particle and has no spin, the spin 
directions of the two W bosons are opposite (Fig. 8.7a). The momentum direction of 
the charged lepton in W⁻ → ℓ⁻ν̄ decays tends to be opposite to the W⁻ spin direction, 
since the anti-neutrino is right-handed and its momentum is aligned with the spin 
direction. For the W⁺, the charged lepton is emitted along the direction of the W⁺ 
spin. As the two W bosons tend to be back-to-back in the x-y plane, 
the directions of the two leptons become close, as shown in Fig. 8.7b. Also, the 
invariant mass of the lepton pair, m_ℓℓ, peaks around 30-40 GeV, while for WW 
pair production it peaks at around 60 GeV, as seen in Fig. 7b of Ref. [6]. 

Since the signal-to-background ratio is quite small in the WW* decay channel, the 
remaining background is still very large after selecting Higgs-like 
events using the properties given above. The remaining background depends strongly 
on the number of accompanying jets. The events are therefore classified according to 
the number of jets: 0-jet, 1-jet, and ≥ 2-jet categories. The main background sources 
for the 0-jet category are irreducible WW production and other diboson production, 
especially WZ events where one of the leptons is missed. In addition, events from 
W + jets production contribute significantly if a jet is misidentified as a lepton. 
Here, the W + jets process represents higher-order DY events, qq̄ → W±, i.e., with 
one or more associated jets. For the 1-jet category, tt̄ production also becomes 
significant, since it produces two b-jets, one of which may fail to be tagged as a 
b-jet. For the 2-jet events, the major contribution is tt̄ events. The basic 
idea of how to suppress these background events is described in the next subsection 
for each category. 


8.1.4.2 Background Reduction 

• 0-jet category 

After the basic requirements of two leptons in the final state, significant missing 
E_T, and explicitly no jet, most of the background is the DY process, 
pp → Z⁰/γ* + X, Z⁰/γ* → ee, μμ, ττ, especially when the two leptons have the 
same flavour (ee or μμ). This background is also significant for the eμ channel, 
however, since both τ leptons in pp → τ⁺τ⁻X processes may decay leptonically, 
giving an eμ pair. 

In order to further reduce the DY events, the correlation of the lepton pair is used by 
requiring the p_T of the dilepton system to be high: p_T^ℓℓ > 30 GeV (Fig. 7a in Ref. [6]). 
Since the Z⁰/γ* in DY processes is produced by qq̄ annihilation, with each quark 
coming from one of the incoming protons, the transverse momentum of the produced 
Z⁰/γ* tends to be small, and the lepton pair from its decay tends to be produced 
back-to-back in the x-y plane. 


The missing E_T may arise in background processes through mismeasurement of the 
energy or momentum of the final-state particles, i.e. the two leptons. In such cases the 
missing E_T tends to be aligned with the momentum direction of these particles. A few 
requirements are therefore applied on the direction of the missing p_T relative to the 
leptons. 


Finally, the azimuthal correlation requirement (Δφ_ℓℓ < 1.8) and the requirement on 
the dilepton mass, m_ℓℓ < 55 GeV, are applied to select events with the H → WW* 
topology described above. 


• 1-jet category 


The event selection for the 1-jet category is very similar to that for the 0-jet events apart 
from a few points: the required jet must not be tagged as a b-jet; p_T^ℓℓ is replaced by 
p_T^ℓℓj, which adds the momentum of the jet; and an additional requirement on the m_ττ 
variable is imposed: m_ττ < m_Z − 25 GeV, where m_Z is the mass of the Z⁰ boson. The 
m_ττ variable is calculated using the so-called “collinear approximation”, which assumes 
that the leptons come from the decays of τ leptons originating from a Z⁰; the momenta of 
the remaining τ decay products, two neutrinos for each decay, are estimated by projecting 
the missing p_T vector onto the two lepton directions. 


• 2-jet category 


The signal-to-noise ratio for the two-jet VBF category is much smaller than for the other 
categories at the stage after the dilepton + missing E_T selection. In order to enrich the 
signal, a machine-learning technique (boosted decision tree, BDT) is used. The details 
of the technique are beyond the scope of this book; here we merely explain the main 
variables used as its inputs. Two variables related to the two forward jets, the jet-jet 
mass m_jj and the rapidity difference between the two jets Δy_jj, play the main role in 
the selection, since both tend to be large in the VBF process. 
Some other variables related to the angular ordering of the VBF jets and the decay products 
of the Higgs boson are used to enrich the VBF process, based on the fact that the Higgs 
boson is produced between the two jets, each of which goes close to the outgoing 
beam direction on opposite sides (see Fig. 8.2b). In addition, since VBF is a 
quark-induced process without a QCD vertex (see Sect. 8.1.1), initial- and final-state 
radiation from partons is largely suppressed with respect to the main 
background process, tt̄ production. The vector sum of p_T over the hard objects in an 
event is sensitive to the amount of such radiation, since the size of this vector indicates 
the recoil received by the objects. 
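
The collinear approximation used for m_ττ in the 1-jet category can be sketched as follows. The missing p_T is decomposed along the two lepton directions in the transverse plane, which gives the visible momentum fraction x of each τ, and the mass follows from m_ττ = m_ℓℓ / sqrt(x₁x₂) for massless leptons. The function names are hypothetical, and letting events without a physical solution pass the veto is an assumption of this sketch rather than something stated in the text.

```python
import math

Z_MASS = 91.1876  # GeV


def massless_pair_mass(lep1, lep2):
    """Invariant mass of two massless leptons; each lepton is (pT, eta, phi)."""
    pt1, eta1, phi1 = lep1
    pt2, eta2, phi2 = lep2
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))


def collinear_mass(lep1, lep2, met_x, met_y):
    """m_tautau in the collinear approximation, or None if there is no physical solution."""
    pt1, _, phi1 = lep1
    pt2, _, phi2 = lep2
    det = math.sin(phi2 - phi1)
    if abs(det) < 1e-6:
        return None  # leptons (anti)parallel in the transverse plane: no unique projection
    # Solve met = nu1 * u1 + nu2 * u2, where u_i is the unit vector along lepton i.
    nu1 = (met_x * math.sin(phi2) - met_y * math.cos(phi2)) / det
    nu2 = (met_y * math.cos(phi1) - met_x * math.sin(phi1)) / det
    if nu1 < 0.0 or nu2 < 0.0:
        return None  # neutrinos would point opposite to the leptons
    x1 = pt1 / (pt1 + nu1)  # visible momentum fraction of each tau
    x2 = pt2 / (pt2 + nu2)
    return massless_pair_mass(lep1, lep2) / math.sqrt(x1 * x2)


def passes_ztautau_veto(m_tautau, window=25.0):
    """Keep events with m_tautau < m_Z - window (events with no solution pass: assumption)."""
    return m_tautau is None or m_tautau < Z_MASS - window
```

The approximation fails for back-to-back leptons, which is one reason why it is applied in the 1-jet category, where the jet recoil gives the dilepton system a transverse boost.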


Figure 8.8 shows the m_T distribution of the events after all the selections for the eμ 
channel. A clear excess over the sum of the backgrounds is observed for both the 0- and 
1-jet categories. The size of the excess divided by the number of events 
expected from the Standard Model Higgs boson production cross section is called the 
signal strength parameter (denoted as μ). The value of μ is extracted from a fit to the 
m_T distributions of all the event categories after fixing the background distributions, 
including their normalisations, as described below. 


8.1.4.3 Background Estimation 

It is difficult to determine the amount of background through a template fit, in 
which the shapes of the signal and background are assumed and the normalisation 
of each contribution is determined by the fit, since the m_T distribution for the signal 
is relatively broad and its shape is somewhat similar to that of some of the 
backgrounds, as seen in Fig. 8.8. The background contribution is therefore 
estimated using event distributions in control regions, where some of the selection 
criteria are inverted so that no event is shared between the signal and 
control regions. 

In this analysis, control regions are prepared for each category 
of events (0, 1, or 2 jets; eμ or ee + μμ final states) and for each background process 
(WW, top, Drell-Yan, etc.). Instead of going through all of them, we pick out a few 
of the most relevant ones. 

For example, the normalisation of the WW contribution is obtained from events in 
the high-m_ℓℓ region, 55 < m_ℓℓ < 110 GeV, for the 0-jet category, so that the purity 
of the WW contribution is improved while keeping event selection criteria similar to 
those of the signal region. The remaining background from non-WW processes in 
this control region is subtracted using simulated events. 
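
The control-region normalisation described above amounts to a simple ratio: the data yield in the control region, minus the simulated contribution of the other processes, divided by the simulated yield of the target process. The sketch below uses made-up event counts purely for illustration; none of the numbers come from Ref. [6].

```python
def normalisation_factor(n_data_cr, n_other_mc_cr, n_target_mc_cr):
    """Data-driven scale factor for the target background in its control region."""
    return (n_data_cr - n_other_mc_cr) / n_target_mc_cr


def predicted_in_signal_region(norm_factor, n_target_mc_sr):
    """Extrapolate to the signal region by scaling the simulated yield there."""
    return norm_factor * n_target_mc_sr


# Hypothetical yields in a WW control region (55 < m_ll < 110 GeV, 0-jet).
nf_ww = normalisation_factor(n_data_cr=1200.0, n_other_mc_cr=300.0, n_target_mc_cr=750.0)
n_ww_sr = predicted_in_signal_region(nf_ww, n_target_mc_sr=500.0)
print(f"WW normalisation factor = {nf_ww:.2f}, predicted WW in signal region = {n_ww_sr:.0f}")
```

Only the ratio of simulated yields between the signal and control regions is taken from simulation in this approach, so uncertainties common to both regions largely cancel.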

The strongest constraint on the normalisation of the tt̄ contribution comes from the 
1-jet category in the eμ final state, but explicitly requiring one b-tagged jet, since 
practically all top quarks decay to the bW final state. In addition, the lepton 
requirement is tightened by requiring m_T^ℓ > 50 GeV, where m_T^ℓ is defined as the 
transverse mass of one of the leptons and the missing p_T vector in the x-y plane. It is 
meant to reconstruct the transverse mass of the W bosons from the top-quark decays. 
After applying these criteria, the control region consists almost entirely of top-quark 
production. The background normalisations determined in this way are consistent 
with simulation for most of the control regions, despite the fact that the event 
selection for H → WW* may lie in a corner of the phase space of the background 
processes. 

After repeating similar exercises for the other event categories, the normalisation 
factors for the background processes as well as the signal contribution are finally 
fixed by performing a simultaneous likelihood fit, where some of the normalisation 
factors are allowed to float while others are fixed. The final result for the 8 TeV 
analysis is μ = 1.09 +0.16/−0.15 (stat) +0.17/−0.14 (syst). The main sources of 
systematic uncertainty are of theoretical origin, such as the cross section prediction 
for the signal itself, since the strength parameter is the ratio of the measured cross 
section to the predicted one. For the 13 TeV analysis, the statistical uncertainty 
improved and became smaller than the total systematic uncertainty. 


8.2 Search for Physics Beyond the Standard Model 


One of the main goals of high-energy experiments is the discovery of new 
phenomena, and there are therefore many data analyses searching for physics beyond 
the Standard Model (BSM). Among them, we explain SUSY (supersymmetry) and 
resonance searches, which are typical BSM searches; their ideas are applicable to 
data analyses for other BSM scenarios. 


8.2.1 SUSY 


Many searches for phenomena beyond the Standard Model target a signal that does 
not produce a resonance of new particles. One of the best examples is the SUSY 
search. Supersymmetry is a new fermion-boson symmetry, in which new fermion 
(boson) partners are introduced for all Standard Model bosons (fermions). For 
example, the supersymmetric partners of the electron (e), weak boson (W), quark (q), 
and gluon (g) are the scalar electron (selectron, ẽ), wino (W̃), scalar quark (squark, 
q̃), and gluino (g̃), respectively.⁶ If the symmetry were exact, they would have the 
same masses as their partners; however, we have not seen such particles so far. The 
symmetry is therefore assumed to be broken, so that the supersymmetric partners can 
be heavy. The lightest supersymmetric particle (LSP) is assumed to be neutral and 
stable (under R-parity conservation) and cannot be detected, so that the LSP is a good 
candidate for dark matter. This is one of the motivations for SUSY models. 

In R-parity-conserving SUSY models, SUSY particles, which are so far 
undiscovered, can be produced in pairs at the LHC, and each SUSY particle then 
eventually decays into SM particles and one LSP. Because of the LSP in 
the decay chain, we cannot reconstruct the mass of any SUSY particle 
produced in the decay chain. Even in such cases, there are several useful variables 
for searching for the SUSY signal, and more are being developed. 

Since the LHC is a pp collider, we expect large production cross sections for SUSY 
signals via the strong interaction: gg → g̃g̃, gg → q̃q̃, and gq → g̃q̃, as shown in 
Fig. 8.9. The search for SUSY in these channels is important at the LHC. 
We focus on the search for SUSY through the gg → g̃g̃ production process, where 
we assume that all SUSY particles other than the lightest neutralino χ̃₁⁰ are 
heavier than the gluino g̃. 

In such a simple scenario, each gluino decays into two quarks plus a χ̃₁⁰ via 
g̃ → qq̃ → qqχ̃₁⁰, giving four quarks (including anti-quarks) and two χ̃₁⁰ in the 
final state. In this analysis, we require four or more high-p_T jets and a large missing 
transverse energy. Additional high-p_T jets might come from initial- and final-state 
radiation. In standard SUSY searches, there are two useful variables for separating 
signal events from background events: the missing transverse energy E_T^miss and 
the so-called “effective” mass 


6 The naming convention for supersymmetric partners is the prefix “s-” for partners of fermions 
and the suffix “-ino” for partners of bosons. 

Fig. 8.9 Feynman diagrams of SUSY production via the strong interaction at a pp collider 


m_eff. The m_eff variable is defined as the scalar sum of the transverse momenta 
of the jets and E_T^miss: 

$$m_{\mathrm{eff}} = \sum_{\mathrm{jets}} p_T + E_T^{\mathrm{miss}}.$$

The number of jets included in the sum depends on the analysis, for example, up to 
four in descending p_T order. The m_eff variable corresponds to the mass of the SUSY 
particle pair initially produced. Figure 8.10 shows the m_eff distribution for the gluino 
and squark search in ATLAS [8]. SUSY signal events can produce bumps in the high 
E_T^miss and m_eff regions. This is the typical SUSY signal for which we have searched. 
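
The effective mass is straightforward to compute once the jets and the missing transverse energy are reconstructed. A minimal sketch follows, assuming that the leading four jets enter the sum; the function name and the event values are hypothetical.

```python
def effective_mass(jet_pts, met, n_jets=4):
    """Scalar sum of the pT of the n_jets leading jets plus the missing ET (GeV)."""
    leading = sorted(jet_pts, reverse=True)[:n_jets]
    return sum(leading) + met


# Hypothetical event: jet pT values in GeV (unordered) and missing ET.
jets = [400.0, 50.0, 250.0, 300.0, 120.0, 30.0]
print(f"m_eff = {effective_mass(jets, met=350.0):.0f} GeV")
```

Softer jets beyond the leading four, which often come from initial- or final-state radiation, do not enter the sum in this convention.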

In practice, we are moving to more complex analyses to improve the signal 
sensitivity, since no SUSY signal has been seen at the LHC. We have adopted 
multivariate analysis techniques such as BDTs and deep learning (DL). The 
E_T^miss and m_eff variables are among their inputs. In these techniques, 
not only each input variable but also the correlations among input variables are 
used to separate the signal from the background. Since the selection criteria are 
determined using MC events (for example, a BDT or DL model is trained with MC 
samples), we must check that the correlations among variables in MC events are as 
similar as possible to those in the real data. Such checks are required when adopting 
multivariate analysis techniques. 


8.2.2 Resonance Search 


As the charmonium was discovered as a resonance in electron pairs and the 
Higgs boson was recently discovered as peaks in pairs of photons and in four 
leptons, it is historically evident that looking for resonances of new particles is one 
of the most effective and easiest ways to search for new physics independently of 
theoretical models. 

The distribution of the invariant mass of oppositely charged muon pairs with 
transverse momentum above 4 GeV and pseudorapidity |η| < 2.5, selected by 
muon triggers in the ATLAS experiment, is shown in Fig. 8.11. When the two 
reconstructed muons originate from a particle with a narrow decay width, such as 
the J/ψ, ψ′, Υ, or Z boson, the invariant mass reconstructed from the momenta and 
energies of the two muons is measured to be around the mass of the resonance. On 
the other hand, the invariant mass reconstructed from two muon candidates 
(including charged particles faking muons) that do not originate from the decay of a 
single particle is distributed continuously, according to the combination of the 
momenta and energies of the two muons. From this example of searching for the 
peaks of “known” particles, one can learn that 

• a more precise measurement of the invariant mass provides a sharper resonance 
peak over the backgrounds, 

• the level of reducible backgrounds due to mismeasurement, such as 
fakes, needs to be lowered as much as possible, and 

• the distribution of irreducible backgrounds needs to be well understood in order 
to estimate the number of background events. 
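
The invariant mass that enters such a dimuon spectrum is obtained by summing the four-momenta of the two muons. A minimal sketch in terms of (p_T, η, φ) follows; the function names and example kinematics are hypothetical, and the muon mass is the usual value of about 0.1057 GeV.

```python
import math

MUON_MASS = 0.1056584  # GeV


def four_vector(pt, eta, phi, mass):
    """(E, px, py, pz) of a particle given pT, pseudorapidity, azimuth, and mass."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(particles):
    """Invariant mass of a list of four-vectors (E, px, py, pz)."""
    e, px, py, pz = (sum(component) for component in zip(*particles))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


# Hypothetical oppositely charged muon pair: (pT [GeV], eta, phi).
mu_plus = four_vector(45.0, 0.3, 0.1, MUON_MASS)
mu_minus = four_vector(42.0, -0.5, 3.0, MUON_MASS)
print(f"m_mumu = {invariant_mass([mu_plus, mu_minus]):.1f} GeV")
```

Filling a histogram of this quantity for all selected pairs gives a spectrum like that of Fig. 8.11, with narrow peaks at the resonance masses on top of a continuous distribution.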


The LHC experiments can search for new resonances predicted by BSM hypotheses 
up to 10 TeV by using the invariant mass reconstructed from combinations 
of two or more electrons, muons, photons, and jets, which include the decay products 
of heavier particles such as top quarks. The following sections show two examples 
of BSM resonance searches. 


8.2.2.1 Dilepton Resonances 

Since electron energies and muon momenta can be measured more precisely 
than jet energies, the dilepton (dielectron and dimuon) final state is the 
most promising channel in BSM resonance searches. From the theoretical point 
of view, various models predict resonances that decay into dileptons, and these can 
be categorised according to their spin. Thus, experimentalists first search for any 
excess in the dilepton mass distribution and then interpret the results of the search 
in terms of models with such new resonances. 

The filled points in Fig. 8.12 show the distributions of the dielectron and dimuon 
invariant mass ($m_{\ell\ell}$) for events passing the full selection using 139 fb$^{-1}$ of pp 
collision data collected at $\sqrt{s} = 13$ TeV with the ATLAS detector [10]. The event 
selection is based on quality cuts on the electrons and muons, their $p_{\mathrm{T}}$, and fiducial 
cuts. The $m_{\ell\ell}$ distribution of the backgrounds, shown as red solid lines in Fig. 8.12, 
is modelled by the function 


$$f_{\ell\ell}(m_{\ell\ell}) = f_{\mathrm{BW},Z}(m_{\ell\ell}) \cdot (1 - x^{c})^{b} \cdot x^{\sum_{i=0}^{3} p_i \log(x)^{i}}, \tag{8.1}$$


where $x = m_{\ell\ell}/\sqrt{s}$, and $b$, $c$, and $p_i$ with $i = 0, \ldots, 3$ are the parameters determined by 
the fit. The function $f_{\mathrm{BW},Z}(m_{\ell\ell})$ is a Breit-Wigner function with $m_Z = 91.1876$ GeV 
and $\Gamma_Z = 2.4952$ GeV, which models the tail of the $Z$ boson line shape 
in the high-mass region. If new heavy particles with pole masses of 1.34, 2, and 
3 TeV existed, one would find peaks in the dilepton mass distribution above the background 
prediction, as shown by the dashed curves in Fig. 8.12. In the prediction for these new 
particles, zero width is assumed, i.e., the width of the distributions is due only to the 
detector resolution. Since the electron energy measured from the electromagnetic 
shower is more precise than the momentum measured for a charged particle in 
the energy region of interest, the dielectron mass reconstructed from the energy 
measurement has better resolution than that from the momentum measurement. For 
the dimuon channel, on the other hand, only the momentum measurement is available. 
Therefore, the mass resolution of the dielectron channel is better than that of the dimuon channel. Figure 8.12 
does not show any sign of a signal from a new particle. To quantify whether 
a signal exists, one can calculate the probability that the data are compatible 
with the background-only hypothesis, as described for the $H \to \gamma\gamma$ peak search (see 
Sect. 8.1.2). 

Fig. 8.12 Distribution of the (a) dielectron and (b) dimuon invariant mass for events passing the 
full selection. Reprinted under the Creative Commons Attribution 4.0 International License from 
[10] © 2019 The Author. Generic zero-width signal shapes, scaled to 20 times the value of the 
corresponding expected upper limit at 95% CL on the fiducial cross section times branching ratio, 
with pole masses of $m_X$ = 1.34, 2, and 3 TeV, as well as background-only fits, are superimposed. 
The data points are plotted at the centre of each bin. The error bars indicate statistical uncertainties 
only. The differences between the data and the fit results in units of standard deviations of the 
statistical uncertainty are shown in the bottom panels 

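
The background model of Eq. (8.1) can be written down directly, as in the following Python sketch. It is an illustration under stated assumptions rather than the fit code of [10]: the non-relativistic form of the Breit-Wigner function and all parameter values are assumptions chosen only to produce a smoothly falling shape, and the overall normalisation is arbitrary.

```python
import math

M_Z = 91.1876      # GeV, value quoted in the text
GAMMA_Z = 2.4952   # GeV, value quoted in the text
SQRT_S = 13000.0   # GeV

def breit_wigner_z(m):
    """Unnormalised non-relativistic Breit-Wigner shape (assumed form)."""
    return 1.0 / ((m - M_Z) ** 2 + GAMMA_Z ** 2 / 4.0)

def dilepton_background(m, b, c, p):
    """Eq. (8.1): f_BW,Z(m) * (1 - x^c)^b * x^(sum_i p_i log(x)^i), x = m / sqrt(s)."""
    x = m / SQRT_S
    log_x = math.log(x)
    exponent = sum(p_i * log_x ** i for i, p_i in enumerate(p))
    return breit_wigner_z(m) * (1.0 - x ** c) ** b * x ** exponent

# Hypothetical parameter values; in the analysis they come from the fit to data.
B, C, P = 2.0, 1.0, (-1.0, 0.1, 0.0, 0.0)

masses = [500.0, 1000.0, 2000.0, 3000.0]  # GeV
shape = [dilepton_background(m, B, C, P) for m in masses]
for m, f in zip(masses, shape):
    print(f"m_ll = {m:6.0f} GeV  f = {f:.3e} (arbitrary units)")
```

In a real fit the parameters would be adjusted so that this function describes the observed spectrum, and a narrow signal shape would be added on top of it to test for an excess.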

8.2.2.2 Dijet Resonances 

New heavy particles that couple to partons, such as excited quarks ($q^*$), are predicted 
in many BSM theories; they can be produced directly in pp collisions at the LHC and 
decay into partons. Events with this kind of new heavy particle produce a peak in 
the distribution of the dijet invariant mass ($m_{jj}$). On the other hand, since in the SM 
the production of jet pairs at hadron colliders primarily results from $2 \to 2$ parton 
scattering processes described by QCD, a smooth and monotonically decreasing 
$m_{jj}$ distribution is expected. The filled points in Fig. 8.13 show 
the $m_{jj}$ distribution for events with $p_{\mathrm{T}} > 150$ GeV for the two leading jets, with 
$|y^*| = \frac{1}{2}|y_1 - y_2| < 0.6$, and $m_{jj}$ greater than 1.1 TeV, where $y_1$ and $y_2$ are the 
rapidities of the two leading jets [11]. The $m_{jj}$ distribution of the backgrounds, shown as the solid 
red line in Fig. 8.13, is empirically known to be described by the formula 


$$f(x) = p_1 (1 - x)^{p_2} x^{p_3 + p_4 \ln x}, \tag{8.2}$$


where $x = m_{jj}/\sqrt{s}$. The parameters $p_1$ to $p_4$ are determined by a fit to the real data. 
If a new heavy resonance existed, one would find a peak in the dijet mass distribution 
above the background prediction, as shown by the open points in Fig. 8.13. The most 
discrepant interval of the $m_{jj}$ distribution of the data compared with the background 
prediction is indicated by the two vertical blue lines in Fig. 8.13. The p-value for the 
most discrepant interval is calculated to be 0.89. 

Fig. 8.13 The reconstructed dijet mass distribution, $m_{jj}$, is shown for events with $p_{\mathrm{T}} > 150$ GeV 
for the two leading jets, with $|y^*| < 0.6$, and $m_{jj}$ greater than 1.1 TeV (filled points). Reprinted 
under the Creative Commons Attribution 4.0 International License from [11] © CERN, for the 
benefit of the ATLAS Collaboration. The solid line depicts the background prediction from the 
sliding-window fit. The vertical lines indicate the most discrepant interval, for which the p-value 
is 0.89 as reported in the figure. The expected contributions for $q^*$ signals with masses of 4 and 
6 TeV are overlaid, normalised to 10 times their predicted cross section. The lower panel shows the 
bin-by-bin significance of the data-fit discrepancy, based only on statistical uncertainties 
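
The idea of a most discrepant interval and its p-value can be illustrated with a simplified toy in Python. This is a sketch under stated assumptions and not the procedure of [11]: the binning, the parameter values of Eq. (8.2), the maximum window width, and all function names are hypothetical, and the simple Poisson sums used here are only adequate for the small event counts of the toy. The scan looks for the window of adjacent bins with the smallest local p-value for an excess, and pseudo-experiments generated from the background prediction give the probability of finding an interval at least as discrepant anywhere in the spectrum.

```python
import math
import random

SQRT_S = 13.0  # TeV

def dijet_background(x, p1, p2, p3, p4):
    """Eq. (8.2): f(x) = p1 (1 - x)^p2 x^(p3 + p4 ln x), with x = m_jj / sqrt(s)."""
    return p1 * (1.0 - x) ** p2 * x ** (p3 + p4 * math.log(x))

def poisson_tail(n_obs, mu):
    """P(N >= n_obs) for a Poisson mean mu (adequate for small toy counts only)."""
    term = math.exp(-mu)
    cdf = 0.0
    for k in range(n_obs):
        cdf += term
        term *= mu / (k + 1)
    return max(0.0, 1.0 - cdf)

def most_discrepant_interval(observed, expected, max_width=3):
    """Return (p, lo, hi) of the window [lo, hi) with the smallest excess p-value."""
    best = (1.0, 0, 0)
    for lo in range(len(observed)):
        for hi in range(lo + 1, min(lo + max_width, len(observed)) + 1):
            n_obs, mu = sum(observed[lo:hi]), sum(expected[lo:hi])
            if n_obs > mu:
                p = poisson_tail(n_obs, mu)
                if p < best[0]:
                    best = (p, lo, hi)
    return best

def poisson_sample(mu, rng):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def global_p_value(observed, expected, n_toys, rng):
    """Fraction of background-only pseudo-experiments at least as discrepant as data."""
    p_obs = most_discrepant_interval(observed, expected)[0]
    n_extreme = 0
    for _ in range(n_toys):
        toy = [poisson_sample(mu, rng) for mu in expected]
        if most_discrepant_interval(toy, expected)[0] <= p_obs:
            n_extreme += 1
    return n_extreme / n_toys

# Hypothetical toy setup: ten bins in m_jj and made-up parameters for Eq. (8.2).
centres = [1.2 + 0.2 * i for i in range(10)]  # bin centres in TeV
params = (2.0e-3, 10.0, -5.0, -0.1)
expected = [dijet_background(m / SQRT_S, *params) for m in centres]

rng = random.Random(7)
background_only = [poisson_sample(mu, rng) for mu in expected]
with_bump = [round(mu) for mu in expected]
with_bump[5] += 30  # inject an artificial excess in one bin

p_bkg = global_p_value(background_only, expected, 200, rng)
p_bump = global_p_value(with_bump, expected, 200, rng)
interval = most_discrepant_interval(with_bump, expected)
print(f"background only: p = {p_bkg:.2f}, with bump: p = {p_bump:.2f}")
```

A p-value close to one, such as the 0.89 quoted above, means that fluctuations at least as large as the observed one are common under the background-only hypothesis, whereas the injected bump in the toy yields a very small p-value.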


Statistics A 


The details of the calculations of the mean, variance, etc., which are not shown 
in the main text, are given in this section. They are intended for beginners in statistics. 


A.1 Binomial Distribution 

A.2 Poisson’s Distribution 

A.3 Maximum Likelihood Method 